Amazon EC2 has built the Spot Instance Marketplace and offers a new type of virtual machine instance called spot instances. These instances are less expensive but are considered failure-prone: regardless of the status of the underlying hardware, a spot instance is terminated whenever its bid price is lower than the market price.
Distributed systems can be built from spot instances to reduce cost while still tolerating instance failures. For example, embarrassingly parallel jobs can use spot instances by re-executing failed tasks, and the bidding framework for such jobs simply selects the spot price as the bid. However, highly available services such as a lock service or a storage service cannot use similar techniques because of their availability requirements. The failure model of spot instances differs from that of normal instances, which traditional distributed-system models assume to fail with a fixed probability. This makes a bidding strategy that preserves service availability more complex for such systems.
We formalize this problem and propose an availability- and cost-aware bidding framework. Experimental results show that our bidding framework can reduce the costs of a distributed lock service and a distributed storage service by 81.23% and 85.32%, respectively, while maintaining the same availability level as on-demand instances.
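
The termination rule and the availability concern described above can be illustrated with a minimal sketch. It is not the proposed framework: the price trace is made up, the function names `out_of_bid_fraction` and `majority_availability` are hypothetical, and treating replica failures as independent is a simplifying assumption. The sketch estimates how often an instance would be out of bid for a given bid price, then feeds that fraction into the standard binomial formula for the availability of a majority-quorum service.

```python
from math import comb


def out_of_bid_fraction(prices, bid):
    """Fraction of trace samples in which the bid is below the market price,
    i.e. samples in which the spot instance would be terminated."""
    if not prices:
        raise ValueError("price trace must be non-empty")
    return sum(1 for price in prices if bid < price) / len(prices)


def majority_availability(n, p_fail):
    """Probability that a strict majority of n replicas is up, assuming
    (as a simplification) independent failures with probability p_fail."""
    need = n // 2 + 1
    return sum(
        comb(n, k) * (1 - p_fail) ** k * p_fail ** (n - k)
        for k in range(need, n + 1)
    )


# Hypothetical spot price samples in USD, invented for illustration only.
trace = [0.030, 0.031, 0.029, 0.120, 0.030, 0.032, 0.031, 0.030, 0.500, 0.030]

for bid in (0.031, 0.15, 1.00):
    p_fail = out_of_bid_fraction(trace, bid)
    avail = majority_availability(5, p_fail)
    print(f"bid={bid:.3f} out-of-bid={p_fail:.2f} 5-replica availability={avail:.5f}")
```

Under these assumptions, a bid close to the typical spot price leaves each instance out of bid for a sizeable fraction of the trace, while a higher bid lowers that fraction at the risk of a higher cost. This is the trade-off that a bidding strategy for a highly available service has to balance.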