在Oracle数据库中,我们可以通过dump各种类型的文件来查看其格式。Oracle还有一些DSI的文档来介绍不同数据类型和数据块的内部存储结构。很多Oracle的爱好者,通过DSI文档和自己对dump的内容做分析,写了一些比较不错的,很有用的工具,比如AUL。Oracle内部也有,这个工具是dul。MySQL虽然是开源的,好像很少有人专门写这些工具。也很少有比较完善的命令。如果从源码量方面来考虑的话,MySQL和Oracle还是没法比的。
Page是整个InnoDB存储的最基本构件,也是InnoDB磁盘管理的最小单位,与数据库相关的所有内容都存储在这种Page结构里。Page分为几种类型,常见的页类型有数据页(B-tree Node)Undo页(Undo Log Page)系统页(System Page) 事务数据页(Transaction System Page)等。单个Page的大小是16K,每个Page使用一个8biy的int值来唯一标识,对应InnoDB最大可以存64TB。
发现Google的工程师Jeremy Cole用ruby写的工具很不错
https://github.com/jeremycole/innodb_ruby
https://github.com/jeremycole/innodb_ruby/wiki
https://github.com/jeremycole/innodb_diagrams
https://github.com/jeremycole/mysql_binlog
下面是Jeremy Cole关于InnoDB的相关链接
- On learning InnoDB: A journey to the core: An introduction to theinnodb_rubyandinnodb_diagramsprojects.
- A quick introduction to innodb_ruby: How to set upinnodb_rubyand a few demos of what it can do.
- The basics of InnoDB space file layout: How InnoDB structures its space files and the pages they contain.
- Page management in InnoDB space files: Structures related to management of file segments, extents, and pages within space files.
- Exploring InnoDB page management with innodb_ruby: Interactive exploration of the page management data structures from a real InnoDB space file.
- The physical structure of InnoDB index pages: A description of InnoDB’s index pages, where data is stored, and how records are placed in them.
- B+Tree index structures in InnoDB: A logical high-level exploration of InnoDB’s B+Tree indexes and their efficiency.
- The physical structure of records in InnoDB: A low-level illustration of InnoDB’s row storage formats.
- Efficiently traversing InnoDB B+Trees with the page directory: A deep examination of efficiency in traversing B+Trees in InnoDB.
- InnoDB bugs found during research on InnoDB data storage: An explanation about 7 different bugs found during the research for this work
- How does InnoDB behave without a Primary Key?: A short discussion about InnoDB’s implicitROW_IDcolumn which is used in tables without a suitablePRIMARY KEY.
- InnoDB Tidbit: The doublewrite buffer wastes 32 pages (512 KiB): How the file segment allocation used in InnoDB plus a bit of programming laziness caused 512 KiB to be wasted in every InnoDB system tablespace.
- The basics of the InnoDB undo logging and history system: A short introduction to multi-version concurrency control, undo logging, InnoDB’s history system, and how they are all related.
- A little fun with InnoDB multi-versioning: A fun and somewhat scary look at the “hidden” effects of multi-versioning and InnoDB’s history system.
- InnoDB with reduced page sizes wastes up to 6% of disk space: InnoDB’s required bookkeeping information for each extent wastes many pages, and it gets a lot worse with reduced (4k or 8k) page sizes.
- Visualizing the impact of ordered vs. random index insertion in InnoDB: Using thespace-lsn-age-illustrateandspace-extents-illustratemodes ofinnodb_spaceto visualize the efficiency of index builds.
通过下面的步骤,我们将一步步的开启探索之旅
安装ruby
在Linux上,我们可以通过yum或者源码来安装ruby
yum方式,如果配置了网络,通过下面的命令安装即可
[root@od ~]# yum install ruby -y
源码方式,如果没配置网络,可以通过源码安装
通git工具clone innodb_ruby[root@od ~]# wget https://cache.ruby-lang.org/pub/ruby/2.5/ruby-2.5.3.tar.gz [root@od ~]# ls -l ruby-2.5.3.tar.gz -rw-r--r--. 1 root root 15972577 Nov 17 00:10 ruby-2.5.3.tar.gz [root@od ~]# tar zxvf ruby-2.5.3.tar.gz [root@od ~]# cd ruby-2.5.3 [root@od ~]# ./configure [root@od ~]# make && make install
[root@od ~]# git clone https://github.com/jeremycole/innodb_ruby.git
Initialized empty Git repository in /root/innodb_ruby/.git/
remote: Enumerating objects: 2663, done.
Receiving objects: 20% (537/2663), 3.75 MiB | 7 KiB/s
remote: Total 2663 (delta 0), reused 0 (delta 0), pack-reused 2663
Receiving objects: 100% (2663/2663), 16.79 MiB | 9 KiB/s, done.
Resolving deltas: 100% (1451/1451), done.
[root@od ~]#
启用innodb_file_per_table,然后通过下面的语句造一些数据
#!/usr/bin/env ruby require "mysql" m = Mysql.new("127.0.0.1", "root", "", "test") m.query("DROP TABLE IF EXISTS t") m.query("CREATE TABLE t (i INT UNSIGNED NOT NULL, PRIMARY KEY(i)) ENGINE=InnoDB") (1..1000000).to_a.shuffle.each_with_index do |i, index| m.query("INSERT INTO t (i) VALUES (#{i})") puts "Inserted #{index} rows..." if index % 10000 == 0 end通过下面的使用手册,开启你的探索之旅
https://github.com/jeremycole/innodb_ruby/wiki
https://blog.jcole.us/2013/01/03/a-quick-introduction-to-innodb-ruby/
关于InnoDB结构的图片和资料,请参考
https://github.com/jeremycole/innodb_diagrams/tree/master/images
国内的MySQL大师也写过,比如何登成
https://github.com/hedengcheng/tech
Reference
https://blog.jcole.us/innodb/
https://github.com/hedengcheng/tech
http://www.ruby-lang.org/zh_cn/documentation/installation/
https://rubygems.org/gems/innodb_ruby/
https://dev.mysql.com/doc/internals/en/innodb-page-example.html
https://www.percona.com/blog/2014/03/05/engineer-duo-google-linkedin-join-innodb-talks/
https://www.percona.com/blog/2017/04/10/innodb-page-merging-and-page-splitting/