Posts Tagged ‘Hardware’

Import performance numbers from a real-world DC-X installation

Posted on February 25, 2010, 12:54h, by Tim Strehle

Being a few months into a medium-sized DC-X installation, I’d like to share a few real-world numbers regarding image and text import speed. During mass import runs, the system had a relatively high load but was still usable. I’m quite happy with the performance so far:

  • 50,000 images imported per hour (off-the-shelf DC-X importer); includes generation of preview images
  • 400,000 text articles (XML) imported per hour (minor performance tweaks needed, 8 parallel processes)
  • 800,000 documents indexed per hour by the Solr full-text search server
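To get a feel for what these rates mean for a planned import, the hourly figures can be turned into a rough duration estimate. The following is only a sketch: it assumes throughput stays constant over the whole run, and the function name and the batch size of one million documents are made up for illustration.

```python
# Hourly throughput figures quoted in the list above.
RATES_PER_HOUR = {
    "image import": 50_000,
    "text import": 400_000,
    "Solr indexing": 800_000,
}


def hours_needed(documents: int, rate_per_hour: int) -> float:
    """Hours needed to process `documents` at a constant hourly rate."""
    if rate_per_hour <= 0:
        raise ValueError("rate_per_hour must be positive")
    return documents / rate_per_hour


if __name__ == "__main__":
    batch = 1_000_000  # hypothetical batch size, not from the installation
    for step, rate in RATES_PER_HOUR.items():
        print(f"{step}: {hours_needed(batch, rate):.2f} h")
```

Real runs will vary with file sizes, system load and the number of parallel processes, so treat the result as an order-of-magnitude estimate.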

The total number of documents in that DC-X instance is currently 3.2 million, with the data taking up 44 GB in MySQL and 31 GB in Solr (plus the actual image, PDF and other files). A full optimization run of the Solr index takes 25 minutes.
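To compare these totals with other installations, it can help to normalize them to a figure per 100,000 documents. A small sketch, assuming the quoted totals are exact; the function name is hypothetical:

```python
def gb_per_100k(total_gb: float, documents: int) -> float:
    """Scale a total size in GB to the size per 100,000 documents."""
    if documents <= 0:
        raise ValueError("documents must be positive")
    return total_gb * 100_000 / documents


if __name__ == "__main__":
    documents = 3_200_000  # document count quoted above
    for name, total_gb in (("MySQL", 44), ("Solr", 31)):
        print(f"{name}: {gb_per_100k(total_gb, documents):.2f} GB per 100,000 documents")
```

For this installation that works out to roughly 1.4 GB (MySQL) and 1.0 GB (Solr) per 100,000 documents, not counting the image, PDF and other files.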

The servers DC-X is running on (set up by Janz):

  • Three IBM System x3650 M2, each with two quad-core Intel Nehalem processors (Xeon X5570 @ 2.93GHz/1333MHz/8MB L3) and 48 GB RAM

The first server runs MySQL and Apache; the second runs Solr, Apache and the regular import processes; the third runs Apache plus occasional mass import processes. Storage being used:

Software:

DC-X Hardware, Software and Space Requirements

Posted on February 18, 2009, 13:58h, by awidhani

You might wonder what kind of system you would need to run DC-X.

Software Requirements – Operating System

Linux – Debian 4.0, Red Hat Enterprise Linux 5, SUSE Linux Enterprise Server 10 and Ubuntu 8.04 are supported (if you are running Oracle, please remember that Oracle insists on SLES or RHEL).

We do not support Solaris yet, but it is definitely on our list. Our focus will be Solaris 10; I don’t think we will ever support any earlier releases.

We have no plans as of today to support other UNIX variants. The main reason is that we rely on third-party open source software that is sometimes unavailable, or not as well supported, on other UNIX variants as it is on Linux or Solaris.

Software Requirements – Database

You will need either MySQL 5.1 (Enterprise) or Oracle 10gR2 or later (this includes 11g). For Oracle, Standard Edition is sufficient: neither Enterprise Edition nor the Partitioning option is required, and the application would not benefit from either. You might of course prefer Enterprise Edition from a DBA perspective or for its enhanced scaling options.

As mentioned elsewhere, we no longer use Oracle Text. For Standard Edition, remember that you can usually go for the much cheaper Standard Edition One if your server has no more than two CPU sockets. For most installations, two-socket quad-core servers should be the most economical choice anyway.

Hardware Requirements

Hardware requirements depend on the number of documents, users etc.

We recommend entry-level, two-socket servers such as those from Dell (PowerEdge 29xx), HP (ProLiant DL3xx) or IBM (x series), to give you some examples. This is what most of our recent DC5 installations use. 8 GB of RAM should be the absolute minimum, as the database, full-text search (Solr) and PHP caching all benefit greatly from a lot of memory, and memory is cheap these days. Most installations should be well covered with all components on one server or spread across two, but we also recommend a standby server.

In case you’re interested: Our demo server (http://dcx.digicol.de) is a Dell PowerEdge 2900 III, 2 Quad-Core Xeon E5410 2.33GHz, 32 GB RAM and a couple of 146 GB SAS 15k disks. MySQL, Apache, Solr and PHP processes all run on one server.

Space Requirements

Again this depends on your specific requirements and data, but here are some figures from our demo system that has a good mix of text, images and videos (but hardly any PDF documents and no text extraction yet).

  • Database: 10 GB per 100,000 documents
  • Text index: 4 GB per 100,000 documents
  • Files: 100 GB per 100,000 documents

This adds up to 114 GB for 100,000 documents.
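If you want a first guess for your own document count, the demo figures can be scaled linearly. This is only a rough sketch: it assumes your mix of text, images and videos resembles the demo system, and the function and dictionary names are made up for illustration.

```python
# Demo-system figures from the list above, in GB per 100,000 documents.
GB_PER_100K = {
    "database": 10,
    "text_index": 4,
    "files": 100,
}


def estimate_space_gb(documents: int) -> dict:
    """Linearly scale the demo figures to `documents` and add a total."""
    if documents < 0:
        raise ValueError("documents must not be negative")
    factor = documents / 100_000
    estimate = {name: gb * factor for name, gb in GB_PER_100K.items()}
    estimate["total"] = sum(estimate.values())
    return estimate


if __name__ == "__main__":
    # 250,000 documents is a hypothetical example size.
    for name, gb in estimate_space_gb(250_000).items():
        print(f"{name}: {gb:.0f} GB")
```

For 250,000 documents this gives 25 GB for the database, 10 GB for the text index and 250 GB for files, 285 GB in total. Installations with many PDF documents or with text extraction enabled will likely differ.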