Archive for the 'Database Management Systems' Category

MySQL: Twelve Days of Scaleout - reprise

As a follow-up to my entry MySQL: Twelve Days of Scaleout, I listened to the webinar Scale-Out & Replication Best Practices for High-Growth Businesses. This one-hour webinar gives a brief overview of different replication approaches to support scaleout. It was a good introduction. The slides are also available. Many of the slides reference white papers or other pages that elaborate on the topic at hand.

The following is the agenda for the webinar:

• Introduction to MySQL
• Scale Out Overview
• Replication Fundamentals
• Statement vs. Row-Based Replication
• Replication Topologies
• Other Replication Strategies
• Case Studies
• Additional Resources
• Conclusion

The companies highlighted in the feature 12 Days of Scaleout used a variety of the techniques covered in this webinar. Each technique has its own advantages and disadvantages. For a complete solution you will likely need to combine several: DRBD (Distributed Replicated Block Device) with Linux Heartbeat to provide complete failover and manage inconsistencies in the case of failure; application partitioning (also called sharding) to manage which ‘partition’ of the database to access; and simpler replication to provide read-only copies for further scalability. There is no one size fits all - it depends on the data and the application. But MySQL has a wealth of techniques (some just becoming available in the latest release - 5.1 - which is still designated as beta on most platforms) - and you should be able to find an appropriate combination of them for your application.
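
To make the combination of application partitioning and read-only replicas a little more concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not something taken from the webinar: the host names, the `ShardRouter` class, and the choice of a CRC32 hash to pick a partition are all my own assumptions, and a real application would hand the chosen host to whatever MySQL client library it uses.

```python
import zlib
from itertools import cycle

# Hypothetical topology: each shard has one master (writes)
# and some read-only replicas fed by MySQL replication.
SHARDS = [
    {"master": "db0-master", "replicas": ["db0-replica1", "db0-replica2"]},
    {"master": "db1-master", "replicas": ["db1-replica1", "db1-replica2"]},
]


class ShardRouter:
    """Picks a database host for a key (illustrative only)."""

    def __init__(self, shards):
        self.shards = shards
        # Round-robin over the replicas of each shard; fall back to the
        # master if a shard has no replicas configured.
        self._readers = [cycle(s["replicas"] or [s["master"]]) for s in shards]

    def shard_index(self, key):
        # A stable hash so the same key always maps to the same shard.
        return zlib.crc32(str(key).encode("utf-8")) % len(self.shards)

    def host_for(self, key, write=False):
        idx = self.shard_index(key)
        if write:
            # All writes for a partition go to that partition's master.
            return self.shards[idx]["master"]
        # Reads are spread across the read-only copies.
        return next(self._readers[idx])


router = ShardRouter(SHARDS)
print(router.host_for("user:42", write=True))  # a master host
print(router.host_for("user:42"))              # one of that shard's replicas
```

Note that a simple modulo scheme like this makes adding shards later painful, since most keys would move; that is one of the reasons the partitioning approach has to be planned from the beginning.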

One last link - Top Five Scale-Out Pitfalls to Avoid. This offers some food for thought in approaching your scaleout solution. Some of this I found familiar - but the rapid scale that can be required in a Web 2.0 application requires planning and architecting from the beginning - and these tips will be helpful.

MySQL: Twelve Days of Scaleout

Tim O’Reilly wrote today about MySQL: Twelve Days of Scaleout in O’Reilly Radar. For twelve days, MySQL will highlight a different company that is using MySQL to scale-out their business. Today’s highlighted company is Wikipedia.

One of the distinctions that MySQL is making with this series of articles is scale-out vs scale-up. They define scale-out as improving performance and scalability incrementally on commodity hardware. Scale-up is the process of making a large up-front investment in more complex and expensive hardware and database licenses.

The examples are compelling. Wikipedia handles 25,000 requests a second with 20 replicated MySQL servers. They also deploy additional MySQL servers on application servers. With (as an example) Oracle - you wouldn’t even think about deploying the additional servers - because of license costs.

Other highlighted companies include: Zimbra, which supports 8 million paid mailboxes across tens of thousands of organizations; Alcatel-Lucent, which has deployed an application that supports 50 million users generating up to 1,000 transactions per second at peak times; and Gumtree.com, which supports over 1 million ads at a time and has deployed 10 MySQL servers to support its service.

If you are deploying expensive database products, it is worth reviewing these case studies, looking at the white papers, and listening in to the webinar on June 20th.

Jim on June 15th 2007 in Database Management Systems

New companies viewed at Launch Silicon Valley sponsored by SVASE

Yesterday I attended Launch Silicon Valley sponsored by SVASE (Silicon Valley Association of Startup Entrepreneurs). It was a sort of beauty contest, with about 30 companies making brief presentations (plus three interesting keynotes - each very different - but each with some interesting insights). It was a particularly eclectic set of companies in very different fields. They were also at different stages: some had cool ideas and technology but no apparent business model, and others had a business model but no idea of how to market it. But to be clear, there were several that are fully formed, and several with good technology concepts that need just a little more thought to get them positioned in the right space. Here are notes on a few of the companies that caught my eye.

Datamash Corporation allows you to share data in spreadsheets (with more sources and targets in the works). This means that I can have a spreadsheet which rolls up data from other people's spreadsheets. As they update their spreadsheets, the data is reflected in my spreadsheet the next time I open it. For a sales manager, this is quite powerful. This effectively creates an interesting database management system, and I can picture a number of issues that need to be explored - but it is a powerful concept.

Jaxtr is one of several services that connect callers - reducing their telephone costs (some use a local phone number), and connecting you without your caller knowing your phone number. Others include Jajah, Jangl, and, from Panttaja Consulting alumnus Mahesh Lalwani, CCube. Different capabilities (depending on the company) include the ability to block specific calls, send some to voice mail, and limit the times at which calls are allowed.

ShapeWriter is a text input product that is based on gestures over a diagram of keys. You draw on the diagram going from letter to letter. You are able to learn the gesture for a given word - and if you are close enough, the system figures out what word you had in mind. It seems especially timely with the announcement from Microsoft of Surface, and also seems like it would be useful with cell phones (especially the iPhone).

TelID was a particularly simple, but compelling idea. It provides an alternative to local search based on a phone number. They are initially working with Yellow Pages companies - to include ads that indicate that there is a TelID link available. You then enter the phone number as part of the URL: www.telid.com/4155551212, and you get a web page about the company (or individual - allowing an insurance agent to have their own page - easily accessible on the web). This takes an existing, well known identifier - the phone number, as the search tool. Clever idea, and clever to link it to Yellow Pages as a first attempt to get critical mass.

Mary has also blogged on this event at mary.panttaja.com.

Database Management Systems - Occasionally Connected Applications

In the last week I have been looking at a couple of new development tools that support the occasionally connected user. I am particularly interested in these - because I am one of those users. When I am in my car, or on a plane, I still want to be able to do my work. But if I am using applications that require connectivity, I may be in trouble (I do have a portable router and cellular communication for my car - but I am not always in range).

Two of these tools are Apollo and Google Gears. Both provide the ability to work with a local database on the client machine when you are not connected. But neither addresses the design patterns required to manage synchronization between the client and the server.

A couple of examples of the design patterns are:

  • An RSS feed client - that just views the current entries. When you next sync, overwrite any data on the client with the current data from the server.
  • A data entry application - that allows users to add or edit data. But for a given data element, there is only one authorized writer/updater. When you next sync, overwrite any data on the server with the new client data. There are still potential conflicts - for instance if the user syncs from a different session (typically machine) and the first machine hadn’t yet done its sync. Any individual sync without apparent conflict is allowed. But updating an already deleted row will obviously be an issue.
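
The two patterns above can be sketched in a few lines of Python. This is a rough illustration under my own assumptions, not how Apollo or Google Gears work: rows are modelled as plain dictionaries keyed by a row id, pending client changes are a dictionary of `{"op": ..., "data": ...}` entries, and the function names are made up for this example.

```python
def sync_server_wins(client_rows, server_rows):
    """RSS-reader pattern: the client only views data, so on sync
    its local copy is simply overwritten with the server's copy."""
    client_rows.clear()
    client_rows.update(server_rows)


def sync_single_writer(client_changes, server_rows):
    """Data-entry pattern: one authorized writer per data element, so
    client changes overwrite the server. Updates to rows that no longer
    exist on the server are not applied; their ids are returned so the
    application can decide what to do with them."""
    conflicts = []
    for row_id, change in client_changes.items():
        if change["op"] == "insert":
            server_rows[row_id] = change["data"]
        elif change["op"] == "update":
            if row_id in server_rows:
                server_rows[row_id] = change["data"]
            else:
                # The row was deleted, e.g. by a sync from another machine.
                conflicts.append(row_id)
        elif change["op"] == "delete":
            server_rows.pop(row_id, None)
    return conflicts


# Usage: row 2 was deleted on the server by an earlier sync from a
# second machine, so the offline update to it is reported as a conflict.
server = {1: "first draft"}
pending = {
    1: {"op": "update", "data": "second draft"},
    2: {"op": "update", "data": "edited offline"},
    3: {"op": "insert", "data": "new entry"},
}
print(sync_single_writer(pending, server))
print(server)
```

Even this toy version has to make a policy decision (what to do with an update to a deleted row), which is exactly the kind of question the tools leave to the application designer.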

The examples get more complicated - and have to be thought through carefully for a given application. These tools make it too easy for people to walk into synchronization issues when implementing an application.