Posted by Graham Stratton
Mon, 12 Feb 2007 15:13:30 GMT
Rsync is one of those wondrous unix utilities that it’s difficult to imagine a world without. Amongst many other uses, it’s a really good way to do daily backups.
Using SSH, rsync is secure too. But for automated access it’s a bit tricky. In order to do automated backups, one needs to set up some sort of passwordless login. This is done by generating an RSA key pair on machine A, and copying the public key to machine B. Now when A tries to log in to B, B has a way to test that A really is A. Cool. But that means that A being compromised leads trivially to B being compromised.
The solution is to limit the range of commands that A can execute on B using a particular SSH key. One can limit it to a single command by prefixing a line of authorized_keys like this:
command="/bin/echo You may not do anything useful"
Now whatever command is sent, this is the command that will be executed.
More about restricting SSH is available in chapter 8 of O’Reilly’s book at http://www.oreilly.com/catalog/sshtdg/chapter/ch08.html There are a number of other useful options such as no-port-forwarding, no-X11-forwarding, no-agent-forwarding and no-pty – don’t forget that options should be separated by commas, with no whitespace.
So it’s quite easy to ensure that only a single command is run. But it’s not so easy if you don’t know exactly what the command will be, as with rsync. In this case, the trick is to run a script which decides whether the original command is permissible. The requested command is available as the environment variable $SSH_ORIGINAL_COMMAND (or $SSH2_ORIGINAL_COMMAND if you’re using SSH2, I believe).
There is a useful script here: http://servers.linux.com/article.pl?sid=04/11/04/0346256
It checks that the command doesn’t contain ; or & characters (ie, there is only one command), and that it begins with ‘rsync --server’. If the command matches, it runs it, otherwise it rejects it. This means that you can’t do anything else with the key you are using for backups (which is good). But what if you want an SSH identity that you can use for manual logins, which is password protected and stored in your keychain? Easy enough, just create two RSA keys.
To specify which key ssh should use, use the -i option:
rsync -e "ssh -i .ssh/id_rsa_backup" --recursive -L /home/graham/tobackup/* back.up/server/
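To make the wrapper-script idea above concrete, here is a minimal Python sketch. It is not the linked script, just my own illustration of the same checks: the function names are made up, and a real deployment would probably want to be stricter about which rsync arguments it accepts.

```python
import os
import shlex
import sys

FORBIDDEN_CHARS = ";&"  # characters that could chain a second command


def is_permitted(command):
    """Return True if the requested command is a single 'rsync --server' call."""
    if any(ch in command for ch in FORBIDDEN_CHARS):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar: refuse rather than guess.
        return False
    return argv[:2] == ["rsync", "--server"]


def main():
    # sshd sets this variable to the command the client asked for
    # when the key has a forced command in authorized_keys.
    requested = os.environ.get("SSH_ORIGINAL_COMMAND", "")
    if not is_permitted(requested):
        print("Rejected", file=sys.stderr)
        return 1
    # Run rsync directly, without a shell, replacing this process.
    argv = shlex.split(requested)
    os.execvp(argv[0], argv)
```

To use something like this, the file would call sys.exit(main()) at the bottom and be named in the command="..." prefix of the backup key's line in authorized_keys.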
Posted in Linux | 4 comments
Posted by Graham Stratton
Wed, 22 Nov 2006 21:54:00 GMT
It sounded like a trivial task: find a library which will produce pie charts for use on the web which has a python interface. Then install it and use it.
Well, it’s taken me most of a day, and I haven’t got there yet. There are a number of small tools which might do the job, but none of them looked quite right, so I thought I’d go the whole way and install matplotlib, which builds on scipy.
I thought I’d see whether these packages were available from the cheese shop. I think they all are, but the install of scipy failed. The first problem was that scientific stuff is written in fortran, so I had to install (from source, no darwinport) gfortran, the GNU Fortran compiler.
Numpy installed fine, but scipy required fftw (which also had to be compiled from source) and some other things.
Compilation then stalled with:
/usr/bin/ld: can’t locate file for: -ldfftpack
which was solved by running
python setup.py build_clib
which creates libdfftpack.a, before continuing with the standard build. Poor dependency handling between numpy.distutils commands is apparently to blame.
Then I ran the scipy tests and some of them failed. There’s a point where it says it would use the Atlas libraries if they were available. I spent a while trying to get them to install, but failed.
I moved on to attempting to install matplotlib. I found a suitable egg, and tried to easy_install it. A linking error for libpng.
sudo port install libpng
didn’t produce an error, but didn’t fix things. Compiling from source manually did. Then I got errors about not being able to find libfreetype. I thought a symlink to the library deep in /Developer/SDKs/MacOSX10.4u.sdk/ would do the trick, but it wasn’t enough. So I despaired, and returned to Google.
I found some pre-built packages for OS X, but they used the system python 2.4, where I was happy with my compiled version, and didn’t want a new one, and figured that installing a new system python would confuse things.
Darwinports also contained a port of matplotlib, so I thought that I would try that. That also insisted on installing its own python, but I was despairing, so I tried it. The full install succeeded, but
import pylab
yielded
ImportError: No module named wx
For some reason I concluded that this was more than the simple dependency problem it appears to be, and decided to try the pre-built packages instead. They installed successfully; I just need to work out how to run the python interpreter I want.
The new system python is at /usr/bin/python – but none of the tons of stuff I had already installed is available. So I’m not sure that I’ve gained anything. Especially since
import pylab
now yields
ImportError: No module named numpy
and fixing that returns me to
No module named wx
Installing the wx binaries left me with a working import. Inspired by this, I installed the darwinports binaries with
sudo port install py-wxpython
I’m hoping this will solve my problems!
But no. Now I’ve even got two terminals which report the same path for ‘which python’, but actually start different versions of the interpreter.
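When several interpreters are fighting like this, asking the interpreter itself is more reliable than asking the shell. The following is a small sketch (the function names are my own) that can be run from each terminal to see which binary is really running and where it looks for modules:

```python
import sys


def interpreter_report():
    """Collect the details that distinguish one Python install from another."""
    return {
        "executable": sys.executable,       # path of the binary actually running
        "version": sys.version.split()[0],  # e.g. the "2.4.x" or "3.x.y" part
        "prefix": sys.prefix,               # root of this install's library tree
        "path": list(sys.path),             # import search path, in order
    }


def can_import(name):
    """Return True if this interpreter can import the named module."""
    try:
        __import__(name)
        return True
    except ImportError:
        return False


report = interpreter_report()
for key in ("executable", "version", "prefix"):
    print(key, "=", report[key])
for module in ("numpy", "wx", "pylab"):
    print(module, "importable:", can_import(module))
```

If two terminals print different executables or prefixes, they are running different installs, whatever ‘which python’ says.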
In summary, it’s all a mess, and I ought to learn a bit more about how compiled languages work (in particular, how the linker looks for libraries). I’m going to have to install this on a debian system too at some point. I’m anticipating it being rather more trivial.
Posted in Python, Mac | 33 comments
Posted by Graham Stratton
Fri, 27 Oct 2006 10:25:05 GMT
I’ve been trying to work for too many hours a day recently, so I thought I’d try to take a bit of a break today. So I’m spending it reading up on some of the things that I think I ought to know about but don’t.
I had a quick look at Schevo earlier. Schevo (abbreviating schema evolution) is a tool which builds on top of an Object database, such as the ZODB or the much smaller but less powerful Durus. It allows you to specify schema, unique keys and other features associated with relational databases. Thus the idea is to provide some sort of hybrid system with enough advantages of both to be better than either for some tasks, in particular rapid web development – not a minor market. There seems to be a severe lack of examples and little discussion since it was presented at Pycon in 2005, but work on it is still progressing. This is definitely something I’d take a closer look at if I had an infinite amount of time.
Back to the intended topic of this article: REST, short for REpresentational State Transfer. It’s something that is increasingly talked about, and some people seem to be very passionate about, but is it something I need to take seriously?
Firstly, the term REST is not being used in exactly the way it was originally intended. As the Wikipedia article says:
The name “REST” is, circa 2006, finding frequent use in a loose sense to describe some programmatic interfaces to portions of the World Wide Web that use Extensible Markup Language (XML) (or, less commonly, YAML, Javascript Object Notation (JSON), or plain text) over the Hypertext Transfer Protocol (HTTP) without an intermediate messaging layer. This usage of the name “REST” serves to distinguish the interfaces from those interfaces which employ Simple Object Access Protocol (SOAP) or remote procedure call (RPC).
So, it’s all about web services, then, and those of us who are producing browser-based stuff can ignore it? Well, maybe. But one of the great advantages of REST-based interfaces is that the XML can easily be modified, for example by XSLT, into standard user interfaces, provided that one has used one’s GETs, PUTs, POSTs and DELETEs correctly. So the line between a web service and a browser-accessible service isn’t as clear as you might expect.
The discussion of SOAP versus REST is an interesting one; for one side of the argument, Paul Prescod has an article on the benefits of REST. Paul gives a lot of reasons that REST is better than SOAP. To add one more, SOAP isn’t as Simple as it might be.
An analogy
I often object to analogies. And I don’t think OO programming is the solution to everything either. But still, where SOAP and RPC are internet versions of procedural programming, REST is about objects. When using REST, URIs should refer to resources, not to services. I do feel that REST is better for it, though this does mean that all the things you can do to a resource have to be done through a predefined set of methods.
SOAP also requires the caller of a function to know about the parameters it takes. It’s a complex situation, and WSDL might make it easier, but it feels rather a lot like the unresponsive world of static typing to me. I don’t think it’s surprising that the dynamic-typing people are nearly all REST supporters.
More on REST/SOAP
I haven’t managed to cover the debate very well; I suggest anyone interested reads Paul Prescod’s REST/SOAP debate. There, he points out that inventing new protocols is bad, HTTP is good enough for most things, that the great strength of HTTP is URIs and that being able to just GET any object provides a great deal of power.
Summary for Web Developers
But getting back to what people developing web sites need to know. Well, you should use POST if you’re going to change anything, and GET otherwise. Repeated calls to a GET request should return the same thing. Well, sort of. To be more precise, making a GET request now shouldn’t affect the result of making a GET request in a minute’s time.
GET requests can be cached. If you try to repeat a POST request by using the back button in your browser, then you will be warned, which is what you probably want before a state-changing request. So POST is safer than GET. There are also a couple of other reasons to use POST: GET parameters are encoded in the query string, which means that the length of the parameters is limited and that non-ascii characters cannot be encoded.
Browsers don’t currently support anything other than GET or POST, though. According to HTTP, POST should be used to create a new object, PUT to modify an object and DELETE – well, yes. For a web service you’d do that correctly. The Rails people have announced ActiveResource, which makes it easy to create a web service, and supports a hidden field so that browsers pretend to do PUT and DELETE properly.
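The hidden-field trick is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not Rails code: the field name _method is my assumption about the convention, and effective_method is a made-up helper.

```python
ALLOWED_OVERRIDES = {"PUT", "DELETE"}


def effective_method(http_method, form_fields, override_field="_method"):
    """Work out which method a browser form submission is pretending to use.

    http_method is the real HTTP method; form_fields is a dict of submitted
    form values. Only a real POST may be upgraded to PUT or DELETE.
    """
    method = http_method.upper()
    if method != "POST":
        # A GET must stay a GET, otherwise a plain link could delete things.
        return method
    requested = form_fields.get(override_field, "").upper()
    if requested in ALLOWED_OVERRIDES:
        return requested
    return method


print(effective_method("POST", {"_method": "delete", "id": "7"}))
print(effective_method("GET", {"_method": "delete", "id": "7"}))
```

Refusing to honour the override on GET is the important design choice here, as it keeps GET requests free of side effects.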
Isn’t it all a mess? One thing which isn’t is the question of how to easily create a link to a POST request without having to create a form. It seems that using a query string is an equally valid approach in POST as it is for GET, though you’ll have to work out what your library does with the query string parameters once it receives them, which may depend on the request type.
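To see what a library has to deal with when a POST carries a query string, here is a rough sketch against a WSGI-style environ dictionary. The helper name split_params is made up, and it only handles urlencoded bodies, so treat it as an illustration rather than what any particular framework does.

```python
import io
from urllib.parse import parse_qs


def split_params(environ):
    """Return (query_params, body_params) for a WSGI request.

    The query string is parsed whatever the method; the body is only
    parsed for a POST with an urlencoded form.
    """
    query = parse_qs(environ.get("QUERY_STRING", ""))
    body = {}
    content_type = environ.get("CONTENT_TYPE", "")
    if (environ.get("REQUEST_METHOD") == "POST"
            and content_type.startswith("application/x-www-form-urlencoded")):
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = environ["wsgi.input"].read(length)
        body = parse_qs(raw.decode("utf-8"))
    return query, body


# A POST to /charts?id=7 with a one-field form body.
example = {
    "REQUEST_METHOD": "POST",
    "QUERY_STRING": "id=7",
    "CONTENT_TYPE": "application/x-www-form-urlencoded",
    "CONTENT_LENGTH": "8",
    "wsgi.input": io.BytesIO(b"name=pie"),
}
query_params, body_params = split_params(example)
print(query_params, body_params)
```

Keeping the two sets of parameters separate leaves it up to the application whether a value from the URL is allowed to stand in for a form field.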
1 comment
Posted by Graham Stratton
Tue, 24 Oct 2006 23:02:29 GMT
I mentioned in a previous post that showing status for web file uploads is a hard problem. At that point I couldn’t see how it would be possible to watch a file upload from anything other than a simple CGI script, but fortunately James Gardner has kindly proven me wrong. (I don’t really appreciate being proven wrong, but if it’s going to happen then the sooner the better. And in this instance James did give me the results of a day’s coding, for which I am very grateful.)
The Python Web Server Gateway Interface (WSGI) is rapidly becoming the standard gluing point between python web frameworks and various servers. See PEP 333 for the specification and James’ article for more details.
Happily, running an application behind the WSGI doesn’t prevent you from being able to watch the status of an upload. Your application is called as soon as the headers are parsed, so you can do what you like with the body of the request. Letting a slightly modified cgi.FieldStorage deal with it is probably the correct idea. Then you can check the status of a file of known name independently of the upload. Example code from at least one of us will appear at some point to make it all clear!
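Until that example code appears, here is a minimal sketch of the general idea in Python. It is not James’ code or mine from the project: it simply wraps wsgi.input so that every read is counted, and the names CountingInput and track_upload, and the way the upload key is chosen, are assumptions for illustration.

```python
import io


class CountingInput:
    """Wrap wsgi.input and record how many body bytes have been read."""

    def __init__(self, stream, total, progress, key):
        self._stream = stream
        self._progress = progress
        self._key = key
        progress[key] = {"received": 0, "total": total}

    def read(self, size=-1):
        chunk = self._stream.read(size)
        self._progress[self._key]["received"] += len(chunk)
        return chunk

    def readline(self, size=-1):
        line = self._stream.readline(size)
        self._progress[self._key]["received"] += len(line)
        return line


def track_upload(app, progress, key_from_environ):
    """WSGI middleware: count body bytes as the wrapped app consumes them.

    progress is a shared dict; key_from_environ picks the upload's key,
    for example from an id the client put in the query string.
    """
    def wrapped(environ, start_response):
        total = int(environ.get("CONTENT_LENGTH") or 0)
        key = key_from_environ(environ)
        environ["wsgi.input"] = CountingInput(
            environ["wsgi.input"], total, progress, key)
        return app(environ, start_response)
    return wrapped
```

A second request, handled in another thread, can then look up the same key in the shared dict to report how far the upload has got. A real version would need locking and some way of cleaning up finished uploads.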
Asynchronous Uploads
But what I actually want to write about now is asynchronous uploads. Rather than a user having to first select all the files she wants to upload and then wait for them to upload, the upload can be progressing whilst she is selecting later files.
I decided that this is the way to go. Once uploads complete, then the user can be shown thumbnails of the uploaded images, so she can see which ones she has selected and change her mind. Since whichever frame an upload comes from blocks until it gets a response, uploads have to be done in a separate iframe. When the user has selected a file, the iframe in which they selected it is hidden, a new one is created in its place, an upload status is added to the list of selected files, and the upload begins. Simple, huh?
Well, it all went a little wrong. Whilst I had working status for a single upload, Firefox stopped giving me status updates when I tried to upload two files at once. This turned out to be due to the number of persistent connections being limited to two by default (go to about:config and look for network.http.max-persistent-connections-per-server). So that was the end of that for a while. Even changing the config settings (not a useful thing, as you can’t get users to reconfigure their browsers) didn’t really work, as the browser became really unresponsive.
What I’ve now decided to do is to add new files to the end of a queue of files to upload; only one is uploaded at once, but still no potential upload time is wasted. Hopefully I’ll get that new version of the code finished soon, and make it available.
no comments
Posted by Graham Stratton
Tue, 17 Oct 2006 08:55:00 GMT
For a long while I thought that Nikon had made a mistake by not using full frame sensors. Surely Canon’s large sensors were going to collect more light and therefore always be more sensitive? But then I realised I was wrong. The sensor size in a system does not affect its sensitivity – only its potential resolution, and there seems to be no problem with cramming enough pixels on a small sensor.
So, why did Canon start making full-frame cameras? Simple, because they have lots of nice 35mm lenses around. And maybe because they couldn’t fit as many pixels as they wanted on a small sensor. But mainly because of the lenses.
Lenses
The job of a lens is to take all the light which comes from a certain set of directions and enters the front of the lens, and focus it onto a sensor. Whatever size our sensor is, there will be the same amount of light entering the front of the lens. With a smaller sensor the lens just needs to focus it onto a smaller area. Since there is always the same amount of light coming in, a sensor of the same sensitivity will yield the same quality of image when the light is focused onto it, regardless of its size.
Using a 1.6 times crop factor sensor with a 35mm lens is exactly equivalent to using the same lens with a 1.6 times teleconverter on a full-frame camera. In both cases, only a small amount (about 1/1.6^2 ~= 0.4x) of the light collected by the lens is projected onto the sensor. Using the teleconverter increases the focal length by 1.6x, and at the same time increases the f-number by 1.6 times (making the lens slower).
f-numbers are most usefully thought of as a ratio of subject brightness (per unit of solid angle) to sensor brightness (per unit area). Hence when one uses a teleconverter to spread the light out over a greater area, the effective f-number increases. Since a small sensor needs to get more light onto a smaller area, a lower f-number is needed. But since the light can be focused onto a smaller area, a lower f-number can be attained.
Effective f-numbers
The concept of focal length multipliers seems to be generally accepted. A 50mm lens on a 1.6x crop camera acts like an 80mm lens on a full-frame camera. But what is not yet accepted is the concept of an aperture number multiplier. A 100/2.0 on a 1.6x crop camera acts not as a 160/2.0, but rather as a 160/3.2.
So, am I saying that if you wanted an effective 80/1.2 for a cropped camera, you’d need a 50/0.75? Yes, I am. Isn’t that impossible to make? No.
Suppose you had a 0.625x teleconverter. Connect this to a full-frame 80/1.2. What do you get? A 50/0.75 (but which only covers the cropped sensor). (Admittedly it is not actually possible to make such a teleconverter, as you would need to move the lens closer to the sensor, just as a 2x teleconverter moves it further away.) A 0.625x teleconverter would make any full-frame lens into the equivalent cropped-sensor lens, just as a 1.6x teleconverter makes a full-frame lens behave as if it were on a cropped-sensor camera.
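The arithmetic in this section is just multiplication by the crop (or converter) factor, which a few lines of Python make explicit. The function names are my own, and the model is the simple one used above: both focal length and f-number scale by the same factor.

```python
def scale_lens(focal_length_mm, f_number, factor):
    """Scale a lens by a crop or teleconverter factor.

    Both the focal length and the f-number are multiplied by the factor,
    so a factor above 1 gives a longer, slower lens.
    """
    return focal_length_mm * factor, f_number * factor


def light_fraction(crop_factor):
    """Fraction of a full-frame image circle's light that lands on the crop."""
    return 1.0 / crop_factor ** 2


# A 100/2.0 on a 1.6x crop camera, expressed in full-frame terms.
focal, aperture = scale_lens(100, 2.0, 1.6)
print("100/2.0 on 1.6x crop acts like %.0f/%.1f" % (focal, aperture))

# The hypothetical 0.625x converter on a full-frame 80/1.2.
focal, aperture = scale_lens(80, 1.2, 0.625)
print("80/1.2 with 0.625x converter gives %.0f/%.2f" % (focal, aperture))

print("light reaching a 1.6x crop sensor: %.2f" % light_fraction(1.6))
```

Note that 0.625 is simply 1/1.6, which is why the second calculation undoes the first kind of scaling.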
The future
So, we can all move to smaller sensors with lighter lenses, then? Er, no. Lenses will still need to be just as big. Current cropped-sensor lenses are light because they have an effective f-number that no one would buy in a full-frame lens.
So far, the move to small sensors has yielded lighter, cheaper lenses. Manufacturers have used the transition to sell lenses with effective f-numbers that would never have sold before. To get effective f-numbers where we need them, we will need to see some striking specifications. Expect to see f/2.0 zooms and f/1.0 primes.
I heard a rumour that Canon are considering doubling the sensor size since they’ve reached the limit of the sensitivity of 35mm sensors. That’s not going to happen. The interesting question is as to whether Canon will abandon full-frame sensors at some point. As they’ve so far only produced a couple of L-series quality EF-S lenses, I don’t expect it to happen soon. In fact, Canon seem to be trying to pull people towards full-frame sensors. That’s fine, provided they can make them nearly as cheaply (actually I believe that full-frame sensors are currently very expensive to make, though they may get cheaper – although chips normally get cheaper because they get smaller). Full frame is also a great idea from a marketing perspective, as most people still believe full-frame is fundamentally better.
I expect to see more EF-S (cropped sensor) lenses from Canon soon. If Canon don’t produce such lenses, I think we will see more people buying from manufacturers who are making cropped-sensor only lenses, such as Sigma. By designing for cropped sensors only, manufacturers can make lenses which are over a stop faster but roughly the same size, weight and price as full-frame lenses of the same focal length.
Notation
Isn’t all this ‘effective’ focal length and aperture stuff getting a bit confusing? Yes, but I don’t think it’s going to change. Lenses for cropped sensor cameras will continue to be specified as if they produced an image which covered a full 35mm sensor.
Conclusion
So, is sensor size really arbitrary? Well, not entirely. Ignoring by far the most important aspect, which is what lenses already exist, there are a couple of other considerations:
a) Whether the manufacturer can fit enough pixels on the sensor to get the resolution they want
b) Whether getting enough logic or light on a small sensor makes it warmer, and therefore more noisy
Does this mean you shouldn’t buy a full-frame camera? Well, if you’re not sure you could give it to me instead. But seriously, there are a lot of very nice 35mm lenses out there which you can only take full advantage of with a full-frame sensor. But what it does mean is that you shouldn’t feel you are missing out on potential sensitivity by buying into a system which only supports cropped sensors, providing you feel that the lenses you want exist or will exist.
96 comments
Posted by Graham Stratton
Fri, 13 Oct 2006 09:23:00 GMT
Until recently I hadn’t realised that there is an issue with file uploads, but now I have and there definitely is.
The problem is that it’s very difficult to give users information about the progress of their uploads, and if they are uploading a large number of large files this might be an issue.
Photobox have gotten round the problem by writing a java applet which is responsible for uploads, but this is overkill for many projects. (But it does also allow users to efficiently upload large numbers of images). Facebook also have a java uploader for bulk uploads. Single file uploads use the iframe trick (more on this later) to display a static message whilst the file is uploading.
Client-side javascript can’t control file uploads, since files are OS-level objects and JavaScript shouldn’t be allowed anywhere near them. Any information about the progress of an upload must come from the server. There are a couple of commercial projects which offer software to solve this problem. Both are Perl scripts. There are no PHP solutions since PHP does not provide any way to monitor the progress on an upload.
It should really be the job of the browser to keep the user informed about the status of uploads, but no browsers seem to do so.
Problems on Both Sides
So we need to implement something on the client side to display status information, and something on the server side to monitor the status of the upload. Finally we need to get them to talk to each other. Strangely enough, this bit is probably the easiest bit, thanks to AJAX libraries.
Client side
The problem on the client side is that when a form is submitted, the frame from which it is submitted blocks until the request is completed, so there is no way that it can display status information. A common trick here is to use a hidden iframe for the upload fields. There are also slightly more complex examples using multiple iframes. (An iframe, by the way, is an inline frame, such as is often used to produce a scrollable box within a page.)
Server side
Here things are even more tricky, as we need to measure how much of a file we have received, and then provide a function that the client can call to return this information. This is very complex, but not as restrictive as I may have once claimed.
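To give a feel for the server side, here is a hedged Python sketch of the bookkeeping involved: something records bytes as they arrive, and a separate function reports the current state in a form an AJAX call could consume. The class name, the JSON fields and the upload ids are all made up for illustration.

```python
import json
import threading


class UploadRegistry:
    """Thread-safe record of how many bytes each upload has received."""

    def __init__(self):
        self._lock = threading.Lock()
        self._uploads = {}

    def start(self, upload_id, total_bytes):
        with self._lock:
            self._uploads[upload_id] = {"received": 0, "total": total_bytes}

    def add_bytes(self, upload_id, count):
        # Called by whatever code is reading the request body.
        with self._lock:
            self._uploads[upload_id]["received"] += count

    def status_json(self, upload_id):
        """Return the upload's progress as a JSON string for the client."""
        with self._lock:
            entry = self._uploads.get(upload_id)
            if entry is None:
                return json.dumps({"known": False})
            total = entry["total"]
            percent = 100.0 * entry["received"] / total if total else 0.0
            return json.dumps({
                "known": True,
                "received": entry["received"],
                "total": total,
                "percent": round(percent, 1),
            })


registry = UploadRegistry()
registry.start("abc", 200)
registry.add_bytes("abc", 50)
print(registry.status_json("abc"))
```

The lock matters because the upload and the status request are handled at the same time, usually in different threads; with separate server processes the registry would have to live somewhere shared, such as a file or database.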
Posted in Python, Javascript | 3 comments
Posted by Graham Stratton
Fri, 06 Oct 2006 11:22:00 GMT
Having decided that javascript is now mature enough to be worth having a look at, I needed to choose a nice javascript library. I liked the Mochikit screencast, but I thought I ought to know what else is out there.
I came across this rather good comparison of javascript libraries.
After reading it I decided that sticking with Mochikit for a while was the way to go. I feel much better about that now that I have an idea of what I am missing elsewhere.
Bob Ippolito has ported many things such as effects from Scriptaculous to Mochikit, leaving the APIs unchanged, which means that switching to Scriptaculous at a later date might not be too hard anyway. But whilst experimenting, that interactive interpreter looks invaluable.
I’ve just viewed the Mochikit screencast again. It’s even better than I remembered it being! If you haven’t seen it you really ought to!
Posted in Javascript | 1 comment
Posted by Graham Stratton
Mon, 02 Oct 2006 13:55:00 GMT
As mentioned in the wikipedia entry on OpenID, OpenID was created by one of the LiveJournal developers.
The idea of OpenID is that users will be able to identify themselves by a single identifier, such as a URL. They can choose a single server to be responsible for their OpenID validation, instead of having to have many usernames and passwords stored on many computers around the planet.
User’s view
To use OpenID validation, one must first create an OpenID identity on a validation server. There are already a number of companies offering this service, for example MyOpenID.
Then, to use a website which supports OpenID, all users need to do is to enter their identity (say a URL). After that, there are two things which may need to happen. One is that they may need to log in to their OpenID server, if they have not logged in during their current browser session. The other is that they may need to tell their OpenID server that this is a safe site to give their ID to. If either of these is the case, users will find themselves redirected to their OpenID server for this purpose, after which they will be redirected back to the site from where they came. If users are already logged in and have previously given their OpenID server permission to access the site, then users do not see any login happening, and they can continue using the site.
Behind the scenes
So what’s actually going on to make all this work?
Well, I read a few summaries and eventually tried the OpenID 1.1 specification, which is actually about the clearest description out there. Nevertheless, I’ll have a go at describing it.
When a user wants to log in to targetsite.com, they fill in a form including their OpenID, say firstname.surname.myopenid.com. Targetsite.com then fetches this URL. In the page returned will be a link showing where their OpenID server is, like this:
<link rel="openid.server" href="https://www.myopenid.com/server" />
So your identity page does not need to be hosted by your OpenID server. Provided you put the above line in it, you can use your homepage URL, myhomepage.com, as your OpenID, as long as you have found an identity provider (IdP) who will assert your control of the identifier myhomepage.com. But what if your identity provider will only assert your ownership of a URL on their site, say firstname.surname.myopenid.com?
In this case, you can delegate the identifier to a different identifier, so your homepage should contain:
<link rel="openid.server" href="https://www.myopenid.com/server" />
<link rel="openid.delegate" href="http://firstname.surname.myopenid.com/" />
Then, to validate your ownership of your homepage, targetsite.com will ask myopenid.com/server/ whether you own firstname.surname.myopenid.com. (OpenID 2.0 is different, see Using your own URL as an OpenID.)
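The discovery step that targetsite.com performs can be sketched with Python’s standard HTML parser. This is only my illustration of the idea, not code from any OpenID library: the class and function names are made up, and fetching the page over HTTP is left out.

```python
from html.parser import HTMLParser


class OpenIDLinkParser(HTMLParser):
    """Collect the openid.server and openid.delegate links from a page."""

    WANTED = ("openid.server", "openid.delegate")

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        # rel may hold several space-separated values.
        for rel in (attributes.get("rel") or "").lower().split():
            if rel in self.WANTED and href:
                self.links.setdefault(rel, href)


def discover(html, claimed_id):
    """Return (server_url, identity_to_check), or None if no server is given.

    If the page delegates, the delegate URL is the identity the server is
    asked about; otherwise it is the URL the user typed in.
    """
    parser = OpenIDLinkParser()
    parser.feed(html)
    server = parser.links.get("openid.server")
    if server is None:
        return None
    return server, parser.links.get("openid.delegate", claimed_id)


page = """<html><head>
<link rel="openid.server" href="https://www.myopenid.com/server" />
<link rel="openid.delegate" href="http://firstname.surname.myopenid.com/" />
</head><body>My homepage</body></html>"""
print(discover(page, "http://myhomepage.com/"))
```

With the delegate line removed from the page, the same call would hand back the homepage URL itself as the identity to check.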
Returning to the plot, targetsite.com will redirect the user agent (ie the browser) to the IdP’s checkid_immediate URL. By redirecting the user agent there, the IdP can access cookies or other credentials in order to check whether the current session is authorized.
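As a rough sketch, building that redirect URL looks something like the following in Python. The parameter names are the ones I understand OpenID 1.1 to use, so check them against the specification; the function name and the targetsite.com paths are made up for the example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def checkid_immediate_url(server_url, identity, return_to, trust_root):
    """Build the URL the browser is redirected to for an immediate check."""
    params = {
        "openid.mode": "checkid_immediate",
        "openid.identity": identity,      # the (possibly delegated) identifier
        "openid.return_to": return_to,    # where the IdP sends the browser back
        "openid.trust_root": trust_root,  # the site asking for the assertion
    }
    # The server URL may already carry a query string of its own.
    separator = "&" if "?" in server_url else "?"
    return server_url + separator + urlencode(params)


redirect = checkid_immediate_url(
    "https://www.myopenid.com/server",
    "http://firstname.surname.myopenid.com/",
    "http://targetsite.com/openid/return",  # hypothetical path
    "http://targetsite.com/",
)
print(redirect)
```

The IdP answers by redirecting the browser to the return_to address with its verdict attached as further query parameters.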
Tags openid | 1 comment
Posted by Graham Stratton
Thu, 31 Aug 2006 21:33:47 GMT
There is growing interest in scripting languages amongst SAP developers. As a python enthusiast, I naturally wanted to use python to interface to SAP.
Scripting languages have to connect to SAP via RFC calls. saprfc is a library to do this in python created by Piers Harding. It’s available on the cheese shop.
For web interfaces there are a number of python web frameworks available. Django and Turbogears are possibly the best known of these. Like Rails, Django is very much centred on database-driven applications. Since interfacing to SAP is not what your average web framework is designed for, I wanted a framework where all the components could be easily replaced. Pylons was suggested, and I have to say I’m really enjoying using it.
Pylons is well documented, flexible (for example, you can easily change templating engines or even mix templates in a controller), and generally feels very clean. For database apps you can use SQLAlchemy or SQLObject for ORM; SQLAlchemy is highly recommended for its flexibility. Form validation is easy using FormEncode, without requiring automated form generation, leaving you in full control of page design.
Posted in Python, Pylons | 2 comments
Posted by Graham Stratton
Tue, 29 Aug 2006 15:19:31 GMT
Having got my account set up so that I can send text messages from software, I wondered what I could do with it!!
I thought that I could provide a form on my website so that people could text me. But I wouldn’t want any robots to be able to use it. Now I don’t imagine there are any robots which go around looking for text message forms, but this got me thinking about ways to avoid such problems.
The standard solution is to require the user to identify text in an image (and if the developer is feeling like making things accessible, providing an audio alternative). Such tests are known as CAPTCHAs. There is a python project pycaptcha, which generates images using PIL. I installed it, but it didn’t work due to the PIL being built without FREETYPE2 support. If I can actually think of a use for it, I’ll try it sometime.
I recommend fastsms.co.uk if you want to send text messages from software to recipients within the UK; it feels like they do most things right.
Posted in Python | 2 comments