<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Digital Collections Blog &#187; Performance</title>
	<atom:link href="http://blog.digicol.de/tag/performance/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.digicol.de</link>
	<description>News for our customers, partners and friends</description>
	<lastBuildDate>Mon, 21 Nov 2011 11:49:47 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2</generator>
		<item>
		<title>Import performance numbers from a real-world DC-X installation</title>
		<link>http://blog.digicol.de/2010/02/25/import-performance-from-a-real-world-dc-x-installation/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=import-performance-from-a-real-world-dc-x-installation</link>
		<comments>http://blog.digicol.de/2010/02/25/import-performance-from-a-real-world-dc-x-installation/#comments</comments>
		<pubDate>Thu, 25 Feb 2010 11:54:11 +0000</pubDate>
		<dc:creator>Tim Strehle</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[DC-X]]></category>
		<category><![CDATA[Hardware]]></category>
		<category><![CDATA[Performance]]></category>

		<guid isPermaLink="false">http://blog.digicol.de/?p=224</guid>
		<description><![CDATA[Being a few months into a medium-sized DC-X installation, I&#8217;d like to share a few real-world numbers regarding image and text import speed. During mass import runs, the system had a relatively high load but was still usable. I&#8217;m quite happy with the performance so far: 50,000 images imported per hour (off-the-shelf DC-X importer); includes [...]]]></description>
			<content:encoded><![CDATA[<p>Being a few months into a medium-sized DC-X installation, I&#8217;d like to share a few real-world numbers regarding image and text import speed. During mass import runs, the system had a relatively high load but was still usable. I&#8217;m quite happy with the performance so far:</p>
<ul>
<li><strong>50,000 images</strong> imported <strong>per hour</strong> (off-the-shelf DC-X importer); includes generation of preview images</li>
<li><strong>400,000 text articles</strong> (XML) imported <strong>per hour</strong> (minor performance tweaks needed, 8 parallel processes)</li>
<li><strong>800,000 documents indexed</strong> per hour by the <a href="http://lucene.apache.org/solr/">Solr</a> full-text search server</li>
</ul>
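<p>To make the hourly figures above easier to compare, here is a minimal sketch that converts them to per-second rates. The variable names are made up for illustration, and the per-process figure assumes the eight text import processes share the load evenly, which the measurements do not confirm.</p>

```python
# Hourly throughput figures quoted in the list above.
HOURLY_RATES = {
    "images": 50_000,
    "text_articles": 400_000,
    "solr_documents": 800_000,
}
TEXT_IMPORT_PROCESSES = 8  # parallel processes used for the text import

SECONDS_PER_HOUR = 3600


def per_second(per_hour: int) -> float:
    """Convert an hourly rate to items per second."""
    return per_hour / SECONDS_PER_HOUR


rates_per_second = {name: per_second(rate) for name, rate in HOURLY_RATES.items()}

# Assumption: the eight text import processes share the load evenly.
text_per_process = rates_per_second["text_articles"] / TEXT_IMPORT_PROCESSES

for name, rate in rates_per_second.items():
    print(f"{name}: {rate:.1f} per second")
print(f"text_articles per process: {text_per_process:.1f} per second")
```

<p>Under these assumptions, that is roughly 14 images, 111 text articles and 222 indexed documents per second, with each text import process handling about 14 articles per second.</p>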
<p>The total number of documents in that DC-X instance is currently 3.2 million, with the data taking up 44 GB in MySQL and 31 GB in Solr (plus the actual image, PDF and other files). A full optimization run of the Solr index takes 25 minutes.</p>
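<p>As a rough back-of-the-envelope check on these totals, the sketch below derives the average storage per document and an estimate for re-indexing everything in Solr. It assumes 1 GB means 10<sup>9</sup> bytes and that the indexing rate of 800,000 documents per hour holds for the whole data set; neither was measured, and the names are illustrative only.</p>

```python
TOTAL_DOCUMENTS = 3_200_000
MYSQL_BYTES = 44 * 10**9  # assumption: 1 GB = 10**9 bytes
SOLR_BYTES = 31 * 10**9   # same assumption
SOLR_DOCS_PER_HOUR = 800_000  # indexing rate quoted earlier in the post

# Average footprint per document, in kilobytes (1 kB = 1000 bytes).
mysql_kb_per_doc = MYSQL_BYTES / TOTAL_DOCUMENTS / 1000
solr_kb_per_doc = SOLR_BYTES / TOTAL_DOCUMENTS / 1000

# Assumption: the indexing rate stays constant across the full data set.
full_reindex_hours = TOTAL_DOCUMENTS / SOLR_DOCS_PER_HOUR

print(f"MySQL: {mysql_kb_per_doc:.2f} kB per document")
print(f"Solr: {solr_kb_per_doc:.2f} kB per document")
print(f"Estimated full re-index: {full_reindex_hours:.1f} hours")
```

<p>With those assumptions, a document averages about 13.75 kB in MySQL and just under 10 kB in Solr, and a full re-index would take around four hours. Treat these as estimates, not measurements.</p>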
<p>The servers DC-X is running on (set up by <a href="http://www.janz.de/is/">Janz</a>):</p>
<ul>
<li>Three IBM System <a href="http://www-03.ibm.com/systems/x/hardware/rack/x3650m2/index.html">x3650 M2</a>, each with:
<ul>
<li>two quad-core Intel Nehalem processors (Xeon X5570 @ 2.93GHz/1333MHz/8MB L3)</li>
<li>48 GB RAM</li>
</ul>
</li>
</ul>
<p>The first server runs MySQL and Apache; the second runs Solr, the regular import processes and Apache; the third runs Apache plus occasional mass import processes. The storage in use:</p>
<ul>
<li>IBM System Storage <a href="http://www-03.ibm.com/systems/storage/disk/ds3000/ds3400/index.html">DS3400</a> and <a href="http://www-03.ibm.com/systems/storage/disk/exp3000/index.html">EXP3000</a></li>
<li>shared using the IBM <a href="http://www-03.ibm.com/systems/software/gpfs/index.html">General Parallel File System</a> (GPFS)</li>
</ul>
<p>Software:</p>
<ul>
<li><a href="http://www.novell.com/products/server/">SUSE Linux Enterprise Server</a> 10</li>
<li><a href="http://lucene.apache.org/solr/">Solr</a> 1.4</li>
<li><a href="http://dev.mysql.com/downloads/mysql/">MySQL Community Server</a> 5.1.40</li>
<li><a href="http://www.php.net/">PHP</a> 5.3.1, <a href="http://httpd.apache.org/">Apache</a> 2</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.digicol.de/2010/02/25/import-performance-from-a-real-world-dc-x-installation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

