MySQL Cluster写入效率测试
MySQL Cluster使用到目前为止遇到渴望得到答案的问题,也是直接影响使用的问题就是MySQL Cluster的写入效率问题和Cluster是否适合大数据存储、如何配置存储的问题。
在之前的测试中MySQL Cluster的写入效率一直不佳,这也是直接影响能否使用MySQL Cluster的关键。现在我们来仔细测试一下。使用的环境略有变化。
Data节点的内存扩展为4G。
集群配置如下:
[ndbd default] # Options affecting ndbd processes on all data nodes: NoOfReplicas=2 # Number of replicas DataMemory=2000M # How much memory to allocate for data storage IndexMemory=300M # How much memory to allocate for index storage # For DataMemory and IndexMemory, we have used the # default values. Since the "world" database takes up # only about 500KB, this should be more than enough for # this example Cluster setup. MaxNoOfConcurrentOperations=1200000 MaxNoOfLocalOperations=1320000
测试代码如下:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
/**
* 向数据库中插入数据
*
* @param conn
* @param totalRowCount
* @param perRowCount
* @param tableName
* @author lihzh(OneCoder)
* @throws SQLException
* @date 2013 -1 -17 下午1:57:10
*/
private void insertDataToTable(Connection conn, String tableName,
long totalRowCount, long perRowCount, long startIndex)
throws SQLException {
conn.setAutoCommit( false);
String sql = "insert into " + tableName + " VALUES(?,?,?)";
System. out.println( "Begin to prepare statement.");
PreparedStatement statement = conn.prepareStatement(sql);
long sum = 0L;
for ( int j = 0; j < TOTAL_ROW_COUNT / BATCH_ROW_COUNT; j++) {
long batchStart = System. currentTimeMillis();
for ( int i = 0; i < BATCH_ROW_COUNT; i++) {
long id = j * BATCH_ROW_COUNT + i + startIndex;
String name_pre = String. valueOf(id);
statement.setLong(1, id);
statement.setString(2, name_pre);
statement.setString(3, name_pre);
statement.addBatch();
}
System. out.println( "It's up to batch count: " + BATCH_ROW_COUNT);
statement.executeBatch();
conn.commit();
long batchEnd = System. currentTimeMillis();
long cost = batchEnd - batchStart;
System. out.println( "Batch data commit finished. Time cost: "
+ cost);
sum += cost;
}
System. out.println( "All data insert finished. Total time cost: "
+ sum);
System. out.println( "Avg cost: "
+ sum/5);
}
分下列情景进行写入测试。
数据加载、写入在内存中时,在独立的新库、新表中一次写入100,1000,10000,50000条记录,分别记录其耗时情况。(5次平均)
100:260ms
1000:1940ms
10000:17683ms(12000-17000)
50000: 93308、94730、90162、94849、162848
与普通单点MySQL写入效率进行对比(2G内存)
100:182ms
1000:1624ms
10000:14946ms
50000:84438ms
双线程并发写入测试
由于只有两个SQL节点,所以这里只采用双线程写入的方法进行测试。代码上采用了简单的硬编码
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
/**
* 多线程并行写入测试
*
* @author lihzh(OneCoder)
* @blog http://www.coderli.com
* @date 2013 -2 -27 下午3:39:56
*/
private void parallelInsert() {
final long start = System. currentTimeMillis();
Thread t1 = new Thread( new Runnable() {
@Override
public void run() {
try {
Connection conn = getConnection(DB_IPADDRESS, DB_PORT,
DB_NAME, DB_USER, DB_PASSOWRD);
MySQLClusterDataMachine dataMachine = new MySQLClusterDataMachine();
dataMachine.insertDataToTable(conn, TABLE_NAME_DATAHOUSE,
500, 100, 0);
long end1 = System.currentTimeMillis();
System. out.println( "Thread 1 cost: " + (end1 - start));
} catch (SQLException e) {
e.printStackTrace();
}
}
});
Thread t2 = new Thread( new Runnable() {
@Override
public void run() {
try {
Connection conn = getConnection(DB_IPADDRESS_TWO, DB_PORT,
DB_NAME, DB_USER, DB_PASSOWRD);
MySQLClusterDataMachine dataMachine = new MySQLClusterDataMachine();
dataMachine.insertDataToTable(conn, TABLE_NAME_DATAHOUSE,
500, 100, 500);
long end2 = System.currentTimeMillis();
System. out.println( "Thread 2 cost: " + (end2 - start));
} catch (SQLException e) {
e.printStackTrace();
}
}
});
t1.start();
t2.start();
}
测试结果:
(总条数/每次) | 线程1(总/平均- 各写一半数据) | 线程2 | 并行总耗时 | 单线程单点 |
1000/100 | 985/197 | 1005/201 | 1005/201 | 2264/226 |
10000/1000 | 9223/1836 | 9297/1850 | 9297/1850 | 19405/1940 |
100000/10000 | 121425/12136 | 122081/12201 | 121425/12136 | 148518/14851 |
从结果可以看出,在10000条以下批量写入的情况下,SQL节点的处理能力是集群的瓶颈,双线程双SQL写入相较单线程单节点效率可提升一倍。但是当批量写入数据达到一定数量级,这种效率的提升就不那么明显了,应该是集群中的其他位置也产生了瓶颈。
注:由于各自测试环境的差异,测试数据仅可做内部比较,不可外部横向对比。仅供参考。
写入测试,要做的还很多,不过暂时告一段落。大数据存储和查询测试,随后进行。
本文由作者按照 CC BY 4.0 进行授权