<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Embracing Chaos &#187; System Architecture</title>
	<atom:link href="http://www.embracingchaos.com/system-architecture/feed" rel="self" type="application/rss+xml" />
	<link>http://www.embracingchaos.com</link>
	<description>Analysis of Trends in Technology, Business, Society</description>
	<lastBuildDate>Mon, 23 Jan 2012 16:29:54 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>XMPP PubSub: a great complement to Atom/RSS</title>
		<link>http://www.embracingchaos.com/2008/07/xmpp-pubsub-a-g.html</link>
		<comments>http://www.embracingchaos.com/2008/07/xmpp-pubsub-a-g.html#comments</comments>
		<pubDate>Tue, 22 Jul 2008 14:11:34 +0000</pubDate>
		<dc:creator>leodirac</dc:creator>
				<category><![CDATA[Geek]]></category>
		<category><![CDATA[Social Computing]]></category>
		<category><![CDATA[System Architecture]]></category>
		<category><![CDATA[XMPP]]></category>

		<guid isPermaLink="false">http://wp.embracingchaos.com/2008/07/xmpp-pubsub-a-g.html</guid>
		<description><![CDATA[I spent the day yesterday at XMPP Summit #5 alongside OSCON in Portland. It was a great chance to catch up with old friends and meet a few new ones. But my favorite part was the break-out discussion of XMPP PubSub as it relates to micro-blogging. We discussed what hopefully will emerge as a standard way to associate an existing Atom/RSS feed with an XMPP PubSub Node. First some background on the relevant technologies. Feel free to skip ahead if you understand this stuff. PubSub 101: Push vs Pull PubSub is short for "publish subscribe" which is a common design...
]]></description>
			<content:encoded><![CDATA[<p>I spent the day yesterday at <a href="http://www.xmpp.org/summit/summit5.shtml">XMPP Summit #5</a> alongside <a href="http://en.oreilly.com/oscon2008/public/content/home">OSCON</a> in Portland.&nbsp; It was a great chance to catch up with old friends and meet a few new ones.&nbsp; But my favorite part was the break-out discussion of XMPP PubSub as it relates to micro-blogging.&nbsp; We discussed what hopefully will emerge as <strong>a standard way to associate an existing Atom/RSS feed with an XMPP PubSub Node.</strong>&nbsp; First some background on the relevant technologies.&nbsp; Feel free to skip ahead if you understand this stuff.</p>
<h3>PubSub 101: Push vs Pull</h3>
<p>PubSub is short for &quot;publish subscribe&quot; which is a common design pattern describing a way to distribute information to interested parties.&nbsp; The publisher tells a server about new information, and the server fans the information out to everybody who has registered interest in that topic or channel.&nbsp; Data consumers find out about the new information very quickly, with relatively little load on the whole system, since the pubsub mechanism provides a means to &quot;push&quot; data to them.&nbsp; </p>
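<p>The fan-out described above can be sketched in a few lines of Python.&nbsp; This is a toy in-memory illustration of the pattern, not XMPP: the <code>Broker</code> class and its method names are made up for this example, and a real server would also handle network connections, authorization and delivery failures.</p>

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class Broker:
    """Toy in-memory publish/subscribe hub (hypothetical, for illustration)."""

    def __init__(self) -> None:
        # Maps a topic name to the callbacks registered for it.
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str], None]) -> None:
        """Register interest in a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: str) -> int:
        """Push the message to every subscriber; return how many were notified."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


# Two consumers register once, then receive new items without asking again.
inbox_a: List[str] = []
inbox_b: List[str] = []
broker = Broker()
broker.subscribe("news", inbox_a.append)
broker.subscribe("news", inbox_b.append)
delivered = broker.publish("news", "hello")
```

<p>The publisher makes one call and the broker does the fan-out, so consumers never have to ask whether anything is new.</p>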
<p>By contrast, almost all of the web today uses a &quot;pull model&quot; where a data consumer only finds out about new information when it gets around to checking if there is something new.&nbsp; This data distribution model is significantly simpler because the server only needs to keep track of the content, not who is interested in knowing about it.&nbsp; Modern networks are optimized for this kind of query-based traffic where data consumers (clients, web browsers) initiate connections to servers, such that it&#8217;s often impossible for servers to initiate connections to clients because of firewalls or NAT.</p>
<p>The downside of the pull model is that the only way a data consumer can find out if anything is new on the server is to &quot;check back frequently&quot; or &quot;poll&quot; the server for changes.&nbsp; If you want to know within 15 minutes if anything new has been posted, you have to ask the server at least every 15 minutes &quot;anything new?&quot;&nbsp; No.&nbsp; &quot;How about now?&quot;&nbsp; No.&nbsp; &quot;Got anything yet?&quot;&nbsp; No.&nbsp; Multiply this by potentially millions of interested data consumers and you end up spending a lot of network bandwidth and server resources doing very little.&nbsp; Even worse, <strong>the problem scales horribly</strong>.&nbsp; If clients want to know about changes within 5 minutes instead of 15, that puts 3 times the load on the server.&nbsp; Want to know within a few seconds?&nbsp; Forget it &#8212; the servers would crash.&nbsp; There&#8217;s an intrinsic delay in distributing information in this model, and reducing that delay is very expensive.</p>
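<p>The arithmetic behind that scaling is simple enough to write down.&nbsp; The sketch below assumes every client polls exactly once per acceptable delay and ignores caching and conditional requests; the function name and the figure of one million clients are illustrative, not measurements.</p>

```python
def polls_per_day(clients: int, max_delay_minutes: float) -> float:
    """Requests per day if every client polls once per max_delay_minutes."""
    minutes_per_day = 24 * 60
    return clients * minutes_per_day / max_delay_minutes


load_15 = polls_per_day(1_000_000, 15)  # 96 million requests per day
load_5 = polls_per_day(1_000_000, 5)    # 288 million requests per day
ratio = load_5 / load_15                # 3.0: a third of the delay, triple the load
```

<p>Load is inversely proportional to the delay clients will tolerate, which is why pushing the delay down to seconds becomes so expensive.</p>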
<h3>XMPP as an alternative to polling</h3>
<p>XMPP is the protocol used for Instant Messaging by Google Talk and Jabber and a large number of small servers.&nbsp; In order to deliver instant messages, XMPP systems maintain persistent connections between all machines that allow packets of data to be pushed with very low latency &#8212; IM messages are typically delivered within a second of sending them.&nbsp; So it&#8217;s natural to want to use this infrastructure to deliver other data more efficiently than through polling HTTP.</p>
<p>The XMPP PubSub spec known as <a href="http://www.xmpp.org/extensions/xep-0060.html">XEP-0060</a> describes how to do exactly this at the protocol level.&nbsp; But for a variety of reasons, this standard has not gained wide adoption.&nbsp; IMHO the biggest reason is that there isn&#8217;t a very pressing need.&nbsp; The current system is horribly inefficient, but it works.&nbsp; Moreover, it puts the burden of inefficiency squarely on the shoulders of the information publishers.&nbsp; Popular publishers are just expected to shell out for the necessary hardware to meet the demands of their readers, and with advertising they can typically recoup the necessary investment.</p>
<p>Another way to state that is that pubsub hasn&#8217;t found its niche yet.&nbsp; IMHO this is partly because the mechanism is so useful it can be applied to almost anything.&nbsp; Not just breaking news, but everything from e-mail mailing lists to <a href="http://daubers.homelinux.net/2008/02/06/bluetooth-xmpp-doorbell/">doorbell chimes</a> get used as examples of how XMPP pubsub technology could be applied.&nbsp; Not wanting to exclude any of these potentially interesting uses, the protocol remains very generic.</p>
<h3>Micro-blogging, Atom and Yesterday&#8217;s Realization</h3>
<p>One place where the current HTTP model breaks down is micro-blogging, which is the generic term for services like <a href="http://www.twitter.com/">Twitter</a> or Facebook&#8217;s updates.&nbsp; Here, the payload of actual content is very small, so the overhead of checking far outweighs the &quot;useful data&quot; which is delivered.&nbsp; Also, because the information is social (i.e. &quot;Heading to Broadway for a bite &#8212; wanna come?&quot;) consumers demand it be delivered quickly.&nbsp; Nonetheless, current micro-blogging services still rely on polling clients, and their servers suffer as a result.</p>
<p>Yesterday, a group of us including <a href="http://twitter.com/blaine">Blaine Cook</a>, <a href="http://anders.conbere.org/journal/">Anders Conbere</a>, <a href="http://evan.prodromou.name/">Evan Prodromou</a>, and XEP-0060 co-author <a href="http://ralphm.net/">Ralph Meijer</a> were discussing <strong>how to apply XMPP PubSub to micro-blogging</strong>.&nbsp; This was likely obvious to many there already, but during the discussion I had a realization.&nbsp; We aren&#8217;t solving this problem from whole cloth.&nbsp; <strong>RSS and Atom feeds already describe all the information we need</strong>.&nbsp; We just need to find a way to substitute XMPP for the assumed transport HTTP.</p>
<p>So we discussed mechanisms for mapping an Atom URL to an XMPP PubSub Node.&nbsp; (We pretty much ignored RSS because RSS isn&#8217;t as cool for reasons I really don&#8217;t understand.)&nbsp; We talked about putting a link-rel tag in the feed to point to the XMPP PubSub node.&nbsp; This would look something like&nbsp; 
</p>
<p><code>&nbsp; &nbsp;&lt;link rel=&quot;alternate&quot; type=&quot;xmpp/pubsub&quot; href=&quot;xmpp:twitter.com?;node=users/leopd&quot; /&gt;<br />
</code></p>
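<p>A client following the link-rel approach would parse the feed and look for that tag.&nbsp; The sketch below uses Python&#8217;s standard <code>xml.etree.ElementTree</code>; the <code>xmpp/pubsub</code> type is the proposal from the discussion, not a registered media type, and the helper name and sample feed are invented for illustration.&nbsp; The sample feed is built in code rather than fetched, so no network access is involved.</p>

```python
import xml.etree.ElementTree as ET
from typing import Optional

ATOM_NS = "{http://www.w3.org/2005/Atom}"
PUBSUB_TYPE = "xmpp/pubsub"  # proposed value from the discussion, not a standard


def find_pubsub_node(feed_xml: str) -> Optional[str]:
    """Return the href of the first Atom link advertising a pubsub node, if any."""
    root = ET.fromstring(feed_xml)
    for link in root.iter(ATOM_NS + "link"):
        if link.get("type") == PUBSUB_TYPE:
            return link.get("href")
    return None


# Build a tiny sample Atom feed carrying the proposed link element.
feed = ET.Element(ATOM_NS + "feed")
ET.SubElement(feed, ATOM_NS + "link",
              {"rel": "self", "href": "http://twitter.com/leopd"})
ET.SubElement(feed, ATOM_NS + "link",
              {"rel": "alternate", "type": PUBSUB_TYPE,
               "href": "xmpp:twitter.com?;node=users/leopd"})
feed_xml = ET.tostring(feed, encoding="unicode")

node_uri = find_pubsub_node(feed_xml)
```

<p>Note that discovery this way costs one HTTP fetch and an XML parse before the client can subscribe.</p>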
<p>Even better, the URL for these nodes should be guessable from the URL for the HTTP feed.&nbsp; The above node would be the normal place to look for the pubsub version of http://twitter.com/leopd.&nbsp; Even though it&#8217;s not as generic and robust to have a standard mapping like this, I think it&#8217;s an important way to speed adoption of a new standard.&nbsp; The code to do a bit of string manipulation is vastly easier than fetching the URL and looking for a link-rel tag.&nbsp; And developers are intrinsically lazy (for good reasons!) so making things easier for them means they&#8217;ll succeed a lot more.</p>
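<p>To show how little code the guessable mapping needs, here is one possible version.&nbsp; No mapping was standardized in the discussion, so the rule used here (same host, with the URL path placed under a <code>users/</code> prefix) is an assumption read off the single Twitter example above, and the function name is hypothetical.</p>

```python
from urllib.parse import urlsplit


def guess_pubsub_uri(feed_url: str, node_prefix: str = "users/") -> str:
    """Guess an XMPP pubsub URI from an HTTP feed URL.

    Assumed rule: keep the host, and use the URL path (without its
    surrounding slashes) as the node name under node_prefix.
    """
    parts = urlsplit(feed_url)
    if not parts.hostname:
        raise ValueError("feed URL has no host: " + repr(feed_url))
    path = parts.path.strip("/")
    return "xmpp:" + parts.hostname + "?;node=" + node_prefix + path


guessed = guess_pubsub_uri("http://twitter.com/leopd")
# guessed == "xmpp:twitter.com?;node=users/leopd"
```

<p>No network round trip is needed, which is the whole appeal; the cost is that every publisher has to follow the same convention.</p>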
<p>Ever pragmatic, Blaine pointed out that we should use HTTP for things it is good at, and not re-invent them in XMPP.&nbsp; I wholeheartedly agree.&nbsp; <strong>Re-transmission</strong> is a key example.&nbsp; What happens if a client is offline when a new post happens, and so never hears about it?&nbsp; Answer: The <strong>clients should fetch the historic archive of the feed over HTTP</strong>.&nbsp; These feeds exist today &#8212; no need to improve on them.&nbsp; If all the posts have sequence numbers on them, then it&#8217;s easy to figure out if you&#8217;ve missed one.&nbsp; So <strong>all the posts from a user should have sequence numbers</strong>.&nbsp; I don&#8217;t think this is standard in Atom feeds today.</p>
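<p>With sequence numbers in place, spotting a missed post is a small set comparison.&nbsp; The sketch below assumes each post carries an integer that increases by one per post from a given user, which is not part of Atom as it stands; the function name is hypothetical.&nbsp; Any numbers it returns are the posts a client would then fetch from the HTTP archive.</p>

```python
from typing import Iterable, List


def missing_sequence_numbers(last_seen: int, received: Iterable[int]) -> List[int]:
    """Return sequence numbers skipped between last_seen and the newest received post."""
    seen = set(received)
    if not seen:
        return []
    newest = max(seen)
    # Everything after last_seen and before the newest push should have arrived.
    return [n for n in range(last_seen + 1, newest) if n not in seen]


# The client last saw post 41, then was pushed posts 42 and 45 after a disconnect.
gaps = missing_sequence_numbers(41, [42, 45])
# gaps == [43, 44]
```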
<h3>The story unfolds&#8230;</h3>
<p>There&#8217;s a lot more to be worked out and standardized here.&nbsp; And clearly many more people need to voice their opinions before we can reach consensus.&nbsp; Sadly I can&#8217;t be down in Portland today to continue the discussion, so this post will have to take my place as I return to my regular daily commitments.&nbsp; If you&#8217;d like to stay tuned as the story unfolds, you&#8217;ll have to poll this site, as I can&#8217;t yet give you a PubSub node to subscribe to for updates.&nbsp; If I could it would probably be something like xmpp:embracingchaos.com?;node=xmpp &#8212; try it.&nbsp; By the time you read this, it might be working!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.embracingchaos.com/2008/07/xmpp-pubsub-a-g.html/feed</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Isolate your Continuous Integration Server!</title>
		<link>http://www.embracingchaos.com/2006/10/continuous_inte.html</link>
		<comments>http://www.embracingchaos.com/2006/10/continuous_inte.html#comments</comments>
		<pubDate>Fri, 20 Oct 2006 14:08:00 +0000</pubDate>
		<dc:creator>leodirac</dc:creator>
				<category><![CDATA[Electronic Security]]></category>
		<category><![CDATA[Software Engineering]]></category>
		<category><![CDATA[System Architecture]]></category>

		<guid isPermaLink="false">http://wp.embracingchaos.com/2006/10/continuous_inte.html</guid>
		<description><![CDATA[Here's a little food for thought about hacking into a development system. If you wanted to gain control of somebody's network how would you do it? Well, you'd probably try to figure out a way to get one of the computers on the inside of their firewall to run some code for you. If you could get it to run an arbitrary block of code that you wrote, then you're probably pretty close to 0wning it. Now think about the continuous integration server in your development farm. What does it do? Whenever anybody checks in new code, it runs all...
]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a little food for thought about hacking into a development system.&nbsp; If you wanted to gain control of somebody&#8217;s network how would you do it?&nbsp; Well, you&#8217;d probably try to figure out a way to get one of the computers on the inside of their firewall to run some code for you.&nbsp; If you could get it to run an arbitrary block of code that you wrote, then you&#8217;re probably pretty close to 0wning it.</p>
<p>Now think about the continuous integration server in your development farm.&nbsp; What does it do?&nbsp; Whenever anybody checks in new code, it runs all the unit tests to make sure they still pass.&nbsp; Or, look at it this way: it takes whatever code anybody checks into the source control system and &#8230; <em>compiles it and runs it</em>.&nbsp; This means that unless you&#8217;re being really careful, anybody who has write access to your source control system has control over your CI server and complete access to your network.</p>
<p>Old strategies for containing this mess included running the CI daemon as a limited authority user or chroot&#8217;ing the process.&nbsp; These days, I think putting the CI server in a dedicated virtual machine is the way to go.&nbsp; VMware&#8217;s newly free <a href="http://register.vmware.com/content/download.html">Server</a> product is perfect for this.&nbsp; </p>
<p>These strategies can limit what somebody can do to the CI server itself.&nbsp; But regardless of that, they&#8217;ve got open access to your network from inside your firewall.&nbsp; So if you&#8217;re being really paranoid (a fine quality in a system administrator IMHO) cut it off from the rest of the network except the source control server.&nbsp; The CI machine generally needs to be able to send e-mail to let folks know when things break, which means it also needs outbound SMTP access.&nbsp; The smart hacker will use this to impersonate somebody within your org and get deeper in through social-engineering.&nbsp; The best way I can see around that is to have a process on the email server poll the status of the last build on the CI system (say over HTTP, perhaps checking an RSS feed that many CI systems support) and send e-mail as appropriate.&nbsp; Remember &#8212; your network firewall rules isolating this box don&#8217;t have to be symmetric.&nbsp; It shouldn&#8217;t be able to see out, but other boxes can still get in.</p>
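<p>The check that the email server runs against the CI feed could be as small as the sketch below.&nbsp; It assumes the feed is RSS 2.0, lists the newest build first, and marks a broken build with the word &quot;failed&quot; in the item title.&nbsp; All three are assumptions that vary between CI systems, and the function name and sample titles are invented.&nbsp; Fetching the feed over HTTP and sending the mail are left out.</p>

```python
import xml.etree.ElementTree as ET


def latest_build_failed(rss_xml: str, failure_marker: str = "failed") -> bool:
    """True if the first item title in the feed contains failure_marker."""
    root = ET.fromstring(rss_xml)
    first_item = root.find("./channel/item")  # assumes newest build is listed first
    if first_item is None:
        return False
    title = first_item.findtext("title", default="")
    return failure_marker.lower() in title.lower()


# Build a tiny sample feed in code; a real CI feed will look different.
rss = ET.Element("rss", {"version": "2.0"})
channel = ET.SubElement(rss, "channel")
for text in ("Build 18 FAILED", "Build 17 passed"):
    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "title").text = text
sample = ET.tostring(rss, encoding="unicode")

should_notify = latest_build_failed(sample)
```

<p>Because the mail server initiates the connection, the CI box itself never needs outbound SMTP access.</p>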
<p>This should make you think seriously about how accessible your SVN server is.&nbsp; What kinds of passwords do your users have on it?&nbsp; Do you require HTTPS?&nbsp; Do you require client certs?&nbsp; What about cached SVN credentials on all those dev boxes?&nbsp; Remember &#8212; if you&#8217;re running a CI server, SVN write access in the wrong hands translates pretty quickly into a whole lot more access.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.embracingchaos.com/2006/10/continuous_inte.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

