Sunday, 19 July 2015

Memcached and Shard

Memcached (Mem-Cache-D) is a general-purpose distributed memory caching system. It is often used to speed up dynamic database-driven websites by caching data and objects in RAM to reduce the number of times an external data source (such as a database or API) must be read.
Memcached is free and open-source software, licensed under the Revised BSD license.[2] Memcached runs on Unix-like operating systems (at least Linux and OS X) and on Microsoft Windows. It depends on the libevent library.
Memcached's APIs provide a very large hash table distributed across multiple machines. When the table is full, subsequent inserts cause older data to be purged in least recently used (LRU) order.[3][4] Applications using Memcached typically layer requests and additions into RAM before falling back on a slower backing store, such as a database.
The size of this hash table is often very large. It is limited to available memory across all the servers in the cluster of servers in a data centre. Where high volume, wide audience web publishing requires it, this may stretch to many gigabytes. Memcached can be equally valuable for situations where either the number of requests for content is high, or the cost of generating a particular piece of content is high.
Memcached was originally developed by Danga Interactive for LiveJournal, but is now used by many other systems, including MocoSpace,[5] YouTube,[6] Reddit,[7] Survata,[8] Zynga,[9] Facebook,[10][11][12] Orange,[13] Twitter,[14] Tumblr[15] and Wikipedia.[16] Engine Yard and Jelastic are using Memcached as part of their platform as a service technology stack[17][18] and Heroku offers several Memcached services[19] as part of their platform as a service. Google App Engine, AppScale, Microsoft Azure and Amazon Web Services also offer a Memcached service through an API.[20][21][22][23]


A database shard is a horizontal partition of data in a database or search engine. Each individual partition is referred to as a shard or database shard. Each shard is held on a separate database server instance, to spread load.
Some data within a database remains present in all shards,[notes 1] but some only appears in a single shard. Each shard (or server) acts as the single source for this subset of data.
Horizontal partitioning is a database design principle whereby rows of a database table are held separately, rather than being split into columns (which is what normalization and vertical partitioning do, to differing extents). Each partition forms part of a shard, which may in turn be located on a separate database server or physical location.
There are numerous advantages to the horizontal partitioning approach. Since the tables are divided and distributed into multiple servers, the total number of rows in each table in each database is reduced. This reduces index size, which generally improves search performance. A database shard can be placed on separate hardware, and multiple shards can be placed on multiple machines. This enables a distribution of the database over a large number of machines, which means that the load can be spread out over multiple machines, greatly improving performance. In addition, if the database shard is based on some real-world segmentation of the data (e.g., European customers v. American customers) then it may be possible to infer the appropriate shard membership easily and automatically, and query only the relevant shard.[2] Disadvantages include :
  • A heavier reliance on the interconnect between servers[citation needed]
  • Increased latency when querying, especially where more than one shard must be searched.[citation needed]
    • Data or indexes are often only sharded one way, so that some searches are optimal, and others are slow or impossible.[clarification needed]
  • Issues of consistency and durability due to the more complex failure modes of a set of servers, which often result in systems making no guarantees about cross-shard consistency or durability.

Sunday, 5 July 2015

Tail Recursion/Call

Tail calls can be implemented without adding a new stack frame to the call stack. Most of the frame of the current procedure is not needed any more, and it can be replaced by the frame of the tail call, modified as appropriate (similar to overlay for processes, but for function calls). The program can then jump to the called subroutine. Producing such code instead of a standard call sequence is called tail call elimination.
When a function is called, the computer must "remember" the place it was called from, the return address, so that it can return to that location with the result once the call is complete. Typically, this information is saved on the call stack, a simple list of return locations in order of the times that the call locations they describe were reached. For tail calls, there is no need to remember the place we are calling from – instead, we can perform tail call elimination by leaving the stack alone (except possibly for function arguments and local variables[1]), and the newly called function will return its result directly to the original caller. Note that the tail call doesn't have to appear lexically after all other statements in the source code; it is only important that the calling function return immediately after the tail call, returning the tail call's result if any, since the calling function will never get a chance to do anything after the call if the optimization is performed.

A tail call can be located just before the syntactical end of a subroutine:
function foo(data) {
    a(data);
    return b(data);
}
Here, both a(data) and b(data) are calls, but b is the last thing the procedure executes before returning and is thus in tail position. However, not all tail calls are necessarily located at the syntactical end of a subroutine. Consider:
function bar(data) {
    if ( a(data) ) {
        return b(data);
    }
    return c(data);
}
Here, both calls to b and c are in tail position. This is because each of them lies in the end of if-branch respectively, even though the first one is not syntactically at the end of bar's body.

For more details visit https://en.wikipedia.org/wiki/Functional_programming

Thanks
Gaurav