Memcached (Mem-Cache-D) is a general-purpose distributed memory caching system. It is often used to speed up dynamic database-driven websites by caching data and objects in RAM to reduce the number of times an external data source (such as a database or API) must be read.
Memcached is free and open-source software, licensed under the Revised BSD license.[2] Memcached runs on Unix-like operating systems (at least Linux and OS X) and on Microsoft Windows. It depends on the libevent library.
Memcached's APIs provide a very large hash table distributed across multiple machines. When the table is full, subsequent inserts cause older data to be purged in least recently used (LRU) order.[3][4] Applications using Memcached typically layer requests and additions into RAM before falling back on a slower backing store, such as a database.
The size of this hash table is often very large. It is limited to available memory across all the servers in the cluster of servers in a data centre. Where high volume, wide audience web publishing requires it, this may stretch to many gigabytes. Memcached can be equally valuable for situations where either the number of requests for content is high, or the cost of generating a particular piece of content is high.
Memcached was originally developed by Danga Interactive for LiveJournal, but is now used by many other systems, including MocoSpace,[5] YouTube,[6] Reddit,[7] Survata,[8] Zynga,[9] Facebook,[10][11][12] Orange,[13] Twitter,[14] Tumblr[15] and Wikipedia.[16] Engine Yard and Jelastic are using Memcached as part of their platform as a service technology stack[17][18] and Heroku offers several Memcached services[19] as part of their platform as a service. Google App Engine, AppScale, Microsoft Azure and Amazon Web Services also offer a Memcached service through an API.[20][21][22][23]
A database shard is a horizontal partition of data in a database or search engine. Each individual partition is referred to as a shard or database shard. Each shard is held on a separate database server instance, to spread load.
Some data within a database remains present in all shards,[notes 1] but some only appears in a single shard. Each shard (or server) acts as the single source for this subset of data.
Horizontal partitioning is a database design principle whereby rows of a database table are held separately, rather than being split into columns (which is what normalization and vertical partitioning do, to differing extents). Each partition forms part of a shard, which may in turn be located on a separate database server or physical location.
There are numerous advantages to the horizontal partitioning approach. Since the tables are divided and distributed into multiple servers, the total number of rows in each table in each database is reduced. This reduces index size, which generally improves search performance. A database shard can be placed on separate hardware, and multiple shards can be placed on multiple machines. This enables a distribution of the database over a large number of machines, which means that the load can be spread out over multiple machines, greatly improving performance. In addition, if the database shard is based on some real-world segmentation of the data (e.g., European customers v. American customers) then it may be possible to infer the appropriate shard membership easily and automatically, and query only the relevant shard.[2] Disadvantages include :
Memcached is free and open-source software, licensed under the Revised BSD license.[2] Memcached runs on Unix-like operating systems (at least Linux and OS X) and on Microsoft Windows. It depends on the libevent library.
Memcached's APIs provide a very large hash table distributed across multiple machines. When the table is full, subsequent inserts cause older data to be purged in least recently used (LRU) order.[3][4] Applications using Memcached typically layer requests and additions into RAM before falling back on a slower backing store, such as a database.
The size of this hash table is often very large. It is limited to available memory across all the servers in the cluster of servers in a data centre. Where high volume, wide audience web publishing requires it, this may stretch to many gigabytes. Memcached can be equally valuable for situations where either the number of requests for content is high, or the cost of generating a particular piece of content is high.
Memcached was originally developed by Danga Interactive for LiveJournal, but is now used by many other systems, including MocoSpace,[5] YouTube,[6] Reddit,[7] Survata,[8] Zynga,[9] Facebook,[10][11][12] Orange,[13] Twitter,[14] Tumblr[15] and Wikipedia.[16] Engine Yard and Jelastic are using Memcached as part of their platform as a service technology stack[17][18] and Heroku offers several Memcached services[19] as part of their platform as a service. Google App Engine, AppScale, Microsoft Azure and Amazon Web Services also offer a Memcached service through an API.[20][21][22][23]
A database shard is a horizontal partition of data in a database or search engine. Each individual partition is referred to as a shard or database shard. Each shard is held on a separate database server instance, to spread load.
Some data within a database remains present in all shards,[notes 1] but some only appears in a single shard. Each shard (or server) acts as the single source for this subset of data.
Horizontal partitioning is a database design principle whereby rows of a database table are held separately, rather than being split into columns (which is what normalization and vertical partitioning do, to differing extents). Each partition forms part of a shard, which may in turn be located on a separate database server or physical location.
There are numerous advantages to the horizontal partitioning approach. Since the tables are divided and distributed into multiple servers, the total number of rows in each table in each database is reduced. This reduces index size, which generally improves search performance. A database shard can be placed on separate hardware, and multiple shards can be placed on multiple machines. This enables a distribution of the database over a large number of machines, which means that the load can be spread out over multiple machines, greatly improving performance. In addition, if the database shard is based on some real-world segmentation of the data (e.g., European customers v. American customers) then it may be possible to infer the appropriate shard membership easily and automatically, and query only the relevant shard.[2] Disadvantages include :
- A heavier reliance on the interconnect between servers[citation needed]
- Increased latency when querying, especially where more than one shard must be searched.[citation needed]
- Data or indexes are often only sharded one way, so that some searches are optimal, and others are slow or impossible.[clarification needed]
- Issues of consistency and durability due to the more complex failure modes of a set of servers, which often result in systems making no guarantees about cross-shard consistency or durability.
No comments:
Post a Comment