Data replication is the process of copying data from one location or database to another. It can take place over any available network, such as a local area network (LAN), a wide area network (WAN), or a storage area network (SAN). The main reason for replication is to ensure prompt data recovery: if data is lost at the primary storage site, the replicated copies can be recovered from secondary and tertiary databases. Greenplum data replication is a good example of this continuous process.

Because of large-scale automation and computerization, most businesses today need reliable systems for storing data. Companies generate data every day that must be retained. While small businesses may produce only modest amounts, large corporations generate huge volumes of information daily and therefore need efficient replication systems.

Data Replication Strategies

Selecting the right replication strategy is vital, and the choice depends on the level of efficiency required and the sensitivity of the data involved. Two strategies are commonly used today: synchronous and asynchronous replication.

Synchronous Replication

Synchronous replication continuously maintains up-to-date copies of data by writing to the primary and secondary sites simultaneously. Because the application must wait for both writes to complete, it introduces latency that slows down primary applications, and it is typically practical only over distances of up to about 300 kilometers.

The process is effectively instantaneous: new data is propagated to multiple databases as soon as it is created, which makes it well suited to disaster recovery. Synchronous replication is expensive, however, because it requires specialized hardware and greater bandwidth. It is preferred by banking institutions because of the nature of the transactions they handle.
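As a rough illustration, not tied to any particular storage product, the following Python sketch shows the core idea of a synchronous write path: the client's write is acknowledged only after both the primary and the secondary site confirm it, which is also where the extra latency comes from. The Store class and synchronous_write function are hypothetical stand-ins for real storage endpoints.

```python
class ReplicationError(Exception):
    """Raised when a write cannot be confirmed on both sites."""


class Store:
    """Hypothetical storage endpoint that persists key/value pairs."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value):
        self.data[key] = value
        return True


def synchronous_write(primary, replica, key, value):
    """Commit to the primary and the replica before acknowledging.

    The caller blocks until both sites confirm the write, so the
    acknowledgement latency includes the round trip to the replica.
    """
    if not primary.write(key, value):
        raise ReplicationError("primary write failed")
    if not replica.write(key, value):
        # A real system would roll back or retry the primary write here.
        raise ReplicationError("replica write failed; transaction not acknowledged")
    return "acknowledged"


if __name__ == "__main__":
    primary = Store("primary-site")
    replica = Store("secondary-site")
    print(synchronous_write(primary, replica, "account:42", {"balance": 100}))
```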

Asynchronous Replication

Asynchronous replication propagates data to the target more slowly than synchronous replication. Data is written to primary storage first and then copied to the target locations at scheduled intervals. This approach is less expensive because it requires less bandwidth, and it is designed to operate over long distances. Companies such as Facebook use this approach to store their data, and it is also commonly used for cloud backups.
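The sketch below illustrates this pattern under simplifying assumptions: writes are acknowledged as soon as they land on the primary, queued, and then shipped to the replica by a background task at a fixed interval. The AsyncReplicator class and its methods are illustrative names, not an actual product API.

```python
import queue
import threading
import time


class AsyncReplicator:
    """Hypothetical primary store that replicates to a copy on a schedule."""

    def __init__(self, interval_seconds=5.0):
        self.primary = {}
        self.replica = {}
        self.pending = queue.Queue()
        self.interval = interval_seconds
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._flush_loop, daemon=True)
        self._worker.start()

    def write(self, key, value):
        # The client is acknowledged as soon as the primary has the data;
        # replication to the secondary happens later.
        self.primary[key] = value
        self.pending.put((key, value))

    def _flush_loop(self):
        # Periodically drain the queue and apply the changes to the replica.
        while not self._stop.is_set():
            time.sleep(self.interval)
            while not self.pending.empty():
                key, value = self.pending.get()
                self.replica[key] = value

    def stop(self):
        self._stop.set()
        self._worker.join()


if __name__ == "__main__":
    r = AsyncReplicator(interval_seconds=1.0)
    r.write("post:1", "hello")
    time.sleep(2)          # allow the scheduled flush to run
    print(r.replica)       # {'post:1': 'hello'} once replication catches up
    r.stop()
```

The trade-off is visible in the example: between the write and the next flush, the replica lags behind the primary, which is why asynchronous replication saves bandwidth and works over long distances at the cost of a recovery window.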

Importance of Data Replication

All data generated during normal business operations must be stored for future use. Internet and social media companies like Facebook and Google generate vast amounts of data daily that must be retained for reference and legal purposes. Other businesses, such as financial institutions, handle many transactions every day whose records must be stored reliably.