Partitioning
Partitioning is dividing up data in the database into distinct independent elements, partitioning serves three main purposes
There are are two ways to partition database tables
Horizontal partitioning | different rows are stored in different tables, for example customers with ID's less than 100,000 are stored in customers1 table and customers with ID's greater are stored in customers2 table, etc. In horizontal partitioning the table schemas are exactly the same (i.e.customer1 and customer2 tables have the same schema), as if the table was cut into parts using a horizontal line. Archiving older data is commonly used example of horizontal partitioning, another technique using horizontal partitioning is sharding which involves using separate servers to host similar types of data, for example customer1 table would be held on server1 and customer2 table would be held on server2. Horizontal partitioning is achieved manually using merge tables, as merge tables do not automatically assign data to different underlying tables. Internal partitioning assigns records to a partition based on certain criteria. |
Vertical partitioning | different fields are stored in different tables, for example customer names and email addresses might be stored in Customers and Addresses tables. Often normalization involves vertical partitioning, which means splitting up otherwise normalized data. In vertical partitioning the tables are split up vertically which means that the schemas will be different but they contain similar records. This type of partitioning is also know as row splitting because each row is being split up. Different physical storage could be used as well, you could even store pictures and large documents on the file system instead of in the database. another option is to split up data in dynamic and static data tables, a static data table could then use query cache to improve performance. |
I have a Oracle section regarding partitioning tables, MySQL offers the following
range partitioning | used for data that can be separated into ranges based on some criterion i.e. date, part number, ID's. |
list partitioning | used for data that can be separated into lists based on some criterion i.e. city, territory. |
hash partitioning | if data in a range partition is not evenly distributed (lots of data for one particular date) then performance decreases, hash partitioning uses hashing algorithms to assign a hash value to each rows partitioning key and place it in the appropriate partition. Hopefully the data will be evenly distributed across the partitions. |
key partitioning | An internal algorithm is used by mysqld to try and evenly distribute the data across the partitions, a field called the key column is used for the determination of data |
Range and list partition can be subpartitioned using hash or key partitioning, this is known as composite partitioning. Each of the partitioning algorithms utilizes a partition key in which partition the data is stored. This partition key is defined when creating the partition and must be an integer. If the table has a primary key it must include all the fields in the partition key, if the table has no primary key but has a unique key it must include all the fields in the partition key, however a table can use partitioning if it has no primary or unique key specified.
Lets see a few examples
range partitioning | ## Creating a new partition table create table employees ( ## Alter a table to add partitioning |
list partitioning | create table employee_by_region ( emp_id int not null, f_name varchar(30) not null, l_name varchar(30) not null, store_id int not null ) partition by list (store_id) ( partition north values in (1,2,3,4,5), partition east values in (6,7,8,9,10), partition south values in (11, 12, 13, 14, 15), partition east values in (16, 17, 18, 19, 20) ); |
hash partitioning | ## here we are creating 5 hash partitions create table employee_by_region ( emp_id int not null, f_name varchar(30) not null, l_name varchar(30) not null, store_id int not null ) partition by hash (store_id) partitions 5; |
key partitioning | ## here we are creating 5 key partitions create table employee_by_region ( emp_id int not null, f_name varchar(30) not null, l_name varchar(30) not null, store_id int not null ) partition by key (store_id) partitions 5; |
composite partitioning | create table invoices ( |
There are a number of partition management commands
add partition | alter table employees_by_region add partition ( partition england values in (1,2), partition scotland values in (3,4) ); |
drop partition | alter table employees_by_region drop partition scotland; |
coalesce partition | alter table employees_hash coalesce partition 4; |
reorganize partition | alter table employees repair partition p0, p1; |
analyze partition | alter table employees analyze partition 3; |
check partition | alter table employees check partition 3; |
optimize partition | alter table employees optimize partition p0, p1; |
rebuild partition | alter table employees rebuild partition p0, p3; |
I have already mentioned these maintenance commands in my data types section.
A merge table is a table structure with no data, which specifies a set of identical structured MyISAM tables. All tables must have exactly the same indexes and fields in the same order, it also specifies where new data should be inserted but it does not have automatic data partitioning
Here is an example
merge table | ## create three identical tables create table employees_3 ( |
When you need to change a merge table you have two options, the first is to drop the merge table and recreate it with a new list of underlying tables, the second is to use the alter table union=(table_one,table-two...) to change which tables are the underlying tables.
Merge tables offer the following advantages