Understanding Safe Database Synchronization
Data or content synchronization is one of the classic problems in the software world. It becomes a pressing concern when working on production software, where production data and schema need to be synchronized with live data and schema. Because the concept carries some inherent complexity, developers are often afraid to use an automated tool, given the risk of losing data or content. In most of these cases a manual process is used to make sure the synchronization is safe. However, humans are error-prone too, so the risk of losing content remains, and the manual process also costs a great deal of time and effort. Having a clear, specific idea of how content synchronization works greatly helps to reduce such overheads. Although the concept of synchronization applies to disk storage, networks, databases and other areas, today we’ll focus on database synchronization, which will also help in understanding synchronization from a generic point of view.
What is synchronization?
So, what is synchronization? It is a process that ensures two participating entities, which may initially hold different sets of content, end up with the same content.
For instance, suppose a database table named Employee has two instances in two different databases, with exactly the same schema definition. After a synchronization process, both tables will contain the same number of data rows and identical column values.
A synchronization process involves two participants, generally termed the source and the destination, where content is copied from the source entity to the destination entity.
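To make the definition concrete, here is a minimal sketch in Python. It assumes each table is held in memory as a dictionary keyed by primary key. The function names `diff_tables` and `is_synchronized` and the sample rows are hypothetical, invented for illustration, and do not come from any particular tool.

```python
def diff_tables(source, destination):
    """Compare two tables given as {primary_key: row_dict} mappings."""
    src_keys, dst_keys = source.keys(), destination.keys()
    return {
        # Rows that exist on one side only.
        "only_in_source": sorted(src_keys - dst_keys),
        "only_in_destination": sorted(dst_keys - src_keys),
        # Rows that exist on both sides but with different column values.
        "different": sorted(
            k for k in src_keys & dst_keys if source[k] != destination[k]
        ),
    }


def is_synchronized(source, destination):
    """Two tables are in sync when no key or column value differs."""
    return not any(diff_tables(source, destination).values())


# Hypothetical Employee rows, invented for illustration.
source = {1: {"name": "Ashraful"}, 3: {"name": "Karim"}}
destination = {1: {"name": "Ashraf"}, 2: {"name": "Rahim"}}

print(diff_tables(source, destination))
print(is_synchronized(source, destination))
```

Running a comparison like this before any synchronization gives a preview of which rows differ, which is the information a manual review would otherwise have to collect by hand.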
Based on the requirements and the characteristics of the data, the synchronization process can be categorized in two ways:
- Unidirectional synchronization: replacing the destination entity with the source entity
- Bidirectional synchronization: merging data from both participating entities
Before looking at both synchronization processes in detail, let’s consider three sample states of the data entities:
a) Initial state: both the source and destination entities contain exactly the same records and column values.
b) Data change state: data has changed in both the source and destination entities.
c) Synchronized state: data has been synchronized between the source and destination entities.
In a unidirectional synchronization, all of the content from the source entity is copied to the destination entity. This also implies that any content in the destination entity that doesn’t exist in the source entity will be deleted.
Unidirectional synchronization carries a high risk of data loss, because every piece of data in the destination entity that doesn’t exist in the source entity is deleted. In the above sample, rows #2 and #5 have been deleted by the synchronization process. So database administrators need to confirm that this data loss is expected.
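The behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: tables are in-memory dictionaries keyed by primary key, the function name `sync_unidirectional` and its `dry_run` parameter are hypothetical, and the sample rows are invented rather than taken from the article's sample. The dry run returns the plan, including the rows that would be deleted, so an administrator can review it before anything changes.

```python
def sync_unidirectional(source, destination, dry_run=True):
    """Make destination an exact copy of source.

    Tables are {primary_key: row_dict} mappings. Returns the plan of
    changes; destination is modified only when dry_run is False.
    """
    shared = source.keys() & destination.keys()
    plan = {
        "insert": sorted(source.keys() - destination.keys()),
        "update": sorted(k for k in shared if source[k] != destination[k]),
        # Rows that exist only in the destination are lost.
        "delete": sorted(destination.keys() - source.keys()),
    }
    if not dry_run:
        destination.clear()
        destination.update({k: dict(row) for k, row in source.items()})
    return plan


# Hypothetical rows: keys 2 and 5 exist only in the destination.
source = {1: {"name": "Ashraful"}, 3: {"name": "Karim"}, 4: {"name": "Salma"}}
destination = {
    1: {"name": "Ashraf"},
    2: {"name": "Rahim"},
    3: {"name": "Karim"},
    5: {"name": "Nadia"},
}

plan = sync_unidirectional(source, destination)  # preview only
print("rows that would be deleted:", plan["delete"])

sync_unidirectional(source, destination, dry_run=False)
print(destination == source)
```

Defaulting `dry_run` to `True` is a deliberate choice in this sketch: the destructive step has to be requested explicitly, after the list of deletions has been checked.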
In bidirectional synchronization, the rows and column values of the source and destination entities are merged, so that both end up holding the data from both participating entities.
Thus, in bidirectional synchronization, no rows are deleted from either the source or the destination entity. The only risk of data loss arises when the same row (identified by its primary key) exists in both entities and has been modified in the source entity: the destination’s version of that row gets replaced.
In the above sample, row #1 has been updated in the destination entity from ‘Ashraf’ to ‘Ashraful’. So database administrators need to confirm that this data replacement is expected.
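A bidirectional merge with this "source wins" rule can be sketched as follows. As before, this is a minimal illustration and not a definitive implementation: tables are assumed to be in-memory dictionaries keyed by primary key, the function name `sync_bidirectional` is hypothetical, and every sample row other than the ‘Ashraf’/‘Ashraful’ pair is invented. The function returns the destination rows it overwrote, so the replacements can be reviewed.

```python
def sync_bidirectional(source, destination):
    """Merge two {primary_key: row_dict} tables in place.

    No row is deleted. When a key exists on both sides with different
    values, the source version replaces the destination version.
    Returns {key: (old_destination_row, new_row)} for each replacement.
    """
    overwritten = {}
    for key, row in source.items():
        if key in destination and destination[key] != row:
            overwritten[key] = (destination[key], dict(row))
        destination[key] = dict(row)
    # Copy rows that exist only in the destination back to the source.
    for key, row in destination.items():
        if key not in source:
            source[key] = dict(row)
    return overwritten


# Hypothetical rows; row 1 differs between the two sides.
source = {1: {"name": "Ashraful"}, 4: {"name": "Salma"}}
destination = {1: {"name": "Ashraf"}, 2: {"name": "Rahim"}}

replaced = sync_bidirectional(source, destination)
print("replaced rows:", replaced)
print(sorted(source) == sorted(destination) == [1, 2, 4])
```

Reporting the overwritten rows instead of silently replacing them keeps the one risky part of the merge visible.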