The Bit Depth Blog – Page 2

26 June, 2009

Improved tools for optimising PNG images – pnqnq and pngquant “Improved”

I’ve taken the liberty of creating WIN32 builds of Kornel Lesinski’s improved pngnq and improved pngquant open source tools.

pngnq and pngquant reduce the file size of a PNG file by reducing the number of colours, using advanced algorithms which produce the most visually pleasing results given the limitations.

pngnq and pngquant are capable of producing 8-bit (or less) optimised images which still contain alpha transparency (fully varied transparency), something which Photoshop cannot do!

This package contains

Improved pngnq v0.9 (January 2009)
Improved pngquant v1.1.1 (January 2009)
Sample batch files
Full source code and any original open source licenses

Download pngnq and pngquant command-line tools for Win32 now

What they’re for

pngnq and pngquant perform lossy compression of PNG images, by reducing the number of colours appearing in the image.

There are many different approaches to doing this, and most graphics applications capable of saving to PNG or GIF images have some algorithm for reducing a full colour image down to an image with a limited number of colours. Different algorithms vary in terms of visual quality and processing time. pngnq and pngquant aim for maximum possible visual quality at the expense of a longer processing time (though they are, to some extent, adjustable), and generally do perform better at this than most graphics software.

Choosing a fixed number of colours to best present a full-colour image is not an easy task, and as such there are a few different approaches.

One approach is to use a fixed set of colours regardless of the image; this is the simplest but also the worst quality approach.

Other approaches look at what colours actually appear in the image, and try to cover the most common ones. Of these, there is still a variety of approaches: median-cut based colour selection picks colours by repeatedly calculating median values of the colours in the full-colour image. Other, more ‘perceptual’ approaches try to place more emphasis on areas of the image where small colour variations are more likely to be noticed by the human eye, and less emphasis on ‘busier’ areas of the image. In the best case, such an algorithm can often produce a reduced colour image that is indistinguishable from the original, by the human eye.

pngquant is a general open source tool to do just this, and can accept pretty much any type of PNG image as its input, though a true-colour PNG, optionally with alpha transparency information, is best. It is a command-line tool, and is cross-platform.

pngnq is an alternative to pngquant which uses the neuquant algorithm, a more complex algorithm which aims to produce better results. It is also command-line and cross-platform. It evolved from an earlier version of pngquant.

Kornel Lesinski’s improved pngnq and improved pngquant tools add some further minor improvements to pngnq and pngquant’s algorithms, giving more pleasing (to my eye) results for images with alpha transparency, especially antialiased boundaries for example on icons. They also contain other various fixes, as documented on their respective web pages, which improve results in some edge cases.

You can choose how many colours you want to end up with, with 256 as the maximum; unlike GIF, PNG’s efficiency does not really suffer if you choose a palette size that is not a power of two.

The resulting PNG images will be viewable in all modern browsers, with an important exception: in Internet Explorer 6, images with alpha transparency will not display their partially transparent areas. Thus, the images will look a lot like they have just 1-bit transparency. Some see this as still better than the alternative of not using alpha transparency, or using full-colour images with alpha transparency and having them completely broken on IE6, due to the relatively graceful way these images degrade on IE6. You should test the results in IE6 and decide for yourself, on an image-by-image basis.

Download pngnq and pngquant command-line tools for Win32 now

17 June, 2009

Figuring Twitter out

The success of Twitter always puzzled me – it was probably the first massively popular web phenomenon that I just could not get. I don’t spend a lot of time on World of Warcraft, or Second Life, but at least I can easily understand their appeal, and why people use them. For Twitter, I just could not understand it.

‘Why would anyone use a service so restrictive and limited in use, and like it?’, I thought. For one, it’s full of self-promotion, utilised by many as one big PR tool. I felt like I was being spammed every time I visited (until, that is, I learned to un-follow any of ‘those’ accounts), and at other times it just looked like a whole bunch of ‘in’ people sharing ‘in’ jokes and having their own little private conversations to which I was not invited, but which were nonetheless put out in the public, as if to say ‘look at me, I actually have friends’ or ‘geek is the new trendy’.

To help myself to understand it – without necessary aiming to like it, but just to get a better grasp of why others did – I told myself eight days ago that I would put something on my Twitter (Tweet?) at least one a day.

Installing TwitterFox has helped. Now I don’t need to go through the hassle of loading a website just to do something which should be trivially easy to do, given you are writing just a few words.
The @ signs in messages annoy me. Twitter should just hide the @, allowing you to link to another user without it looking like some sort of secret geek language.
The need to use URL shorteners annoys me. Twitter should just hide the URL and show it as link text instead, allowing the link to show up as part of your sentence, you know, like in HTML. This is probably the single most embarrassingly backward feature of Twitter, so lacking that an entire industry of ‘URL shortening services’ has thrived as a result of this limitation. Imagine their lack of
Twitter is not open. It’s controlled by one company (and doesn’t have a good record of staying up). Can I host a Twitter site on my own server? I guess this point is kind of moot; as much as I find it mildly irritating, most people, unlike me, don’t really care too much about ‘freedom’ in that sense. But even a viable Twitter competitor would open up the market a bit. I may check out identi.ca at some stage.
I have been inspired by the likes of Sockington, who I think uses the Twitter format really well.

All in all, I’ve found that it is actually easier to maintain a regular posting habit on Twitter than on my blog, which is refreshing. A week ago I didn’t understand the point of Twitter at all, and while I still see Twitter as largely made up of chaff I can now understand that it is like any other self-published media, including blogging: 95% of it is uninteresting, but the 5% that’s good can make it an enjoyable experience.

You can check out my Twitter here and tell me if I am a twit or not.

14 May, 2009

Distributed Version Control Systems are not Voodoo

Distributed version control is nothing to be scared of. It may be a lot simpler than a lot of people or ‘introductory tutorials’ have led you to believe.

Distributed version control doesn’t force you into a new way of working. If you prefer a centralised approach, you can still do that. You could use it very much like you currently use svn if you want to, and yet you’ll still get the benefit of having a local history of revisions, allowing you to do a lot of things such as revert, diff, or even check in some code locally – without the overhead of network access.
Distributed version control does not force you to have a mess of different branches of your code on different developers’ machines, with no clear indication as to which one is most ‘up to date’. You would still have a ‘main’ or ‘trunk’ branch on a server somewhere which represents the main focus of development. Developers get the added option of having new branches on their own machine, but this doesn’t mean that they can’t send that code back to some shared trunk when they finish something.
The benefits to distributed version control do not end at the ‘coding while on a plane’ scenario. While this is indeed a benefit of distributed version control (while coding without network access you still have access to the complete revision history and can do commits), this is not the sole reason behind DVCS. You also get the benefit of reduced network overhead, making typical operations much faster, and you get a lot more flexibility in choosing a suitable workflow, and altering that workflow when it suits you. If you are working on your own pet feature but don’t want to check in your code to the main trunk until it’s done, you can still easily use revision control on your own little copy, and when it does come time to check in and merge all your changes to the trunk all the little revisions you made when you were working on it separately are preserved. This requires no setting up on the central server – any developer can create their own little parallel branch for a while, then merge it back up – it can be as simple as doing a ‘local commit’ instead of a ‘commit’ or ‘check-in’ (for me, that’s just one checkbox). Merging may sound complicated, but it’s not – it simply works. The system is smart enough to figure out which file was which and you can’t really break it – anything is reversible.
Not all DVCS are git, or particularly like git. Git originated as a special-purpose version control system which Linus Torvalds developed to support his own personal workflow, managing the Linux kernel. While it now has pretty much all the features of a full-featured version control system, it still has its origins as a tool built especially for one person’s own workflow. Undoubtedly it is a good system, but if you try git and don’t like it, you should not assume that other distributed version control systems are the same as it. Git also has problems on Windows (a native Msys version is still being developed as of this blog post).

My version control system of choice at the moment is Bazaar, chosen because I need both Windows and Linux capability and I like that it is easy to use. There is not much separating it from Mercurial except for some small, almost philosophical conventions. Just as one almost insignificant example, Bazaar versions directories, allowing empty directories to be significant. I’d recommend either. Bazaar has the massive support of Canonical (of Ubuntu fame) and big projects such as MySQL. Mercurial has big projects such as Mozilla (of Firefox fame). You can get near instant help for either by going to their respective freenode IRC channels or by asking a question on Stack Overflow.

If you are interested, try out this “Bazaar in five minutes” tutorial. Longer tutorials often give the impression that distributed version control is more complicated than it really is.

29 April, 2009

Three ways to work with XML in PHP

‘Some people, when confronted with a problem, think “I know, I’ll use XML.” Now they have two problems.’
– stolen from somewhere

DOM is a standard, language-independent API for heirarchical data such as XML which has been standardized by the W3C. It is a rich API with much functionality. It is object based, in that each node is an object.DOM is good when you not only want to read, or write, but you want to do a lot of manipulation of nodes an existing document, such as inserting nodes between others, changing the structure, etc.
SimpleXML is a PHP-specific API which is also object-based but is intended to be a lot less â€˜terseâ€™ than the DOM: simple tasks such as finding the value of a node or finding its child elements take a lot less code. Its API is not as rich than DOM, but it still includes features such as XPath lookups, and a basic ability to work with multiple-namespace documents. And, importantly, it still preserves all features of your document such as XML CDATA sections and comments, even though it doesnâ€™t include functions to manipulate them.
SimpleXML is very good for read-only: if all you want to do is read the XML document and convert it to another form, then itâ€™ll save you a lot of code. Itâ€™s also fairly good when you want to generate a document, or do basic manipulations such as adding or changing child elements or attributes, but it can become complicated (but not impossible) to do a lot of manipulation of existing documents. Itâ€™s not easy, for example, to add a child element in between two others; addChild only inserts after other elements. SimpleXML also cannot do XSLT transformations. It doesnâ€™t have things like â€˜getElementsByTagNameâ€™ or getElementByIdâ€™, but if you know XPath you can still do that kind of thing with SimpleXML.
The SimpleXMLElement object is somewhat â€˜magicalâ€™. The properties it exposes if you var_dump/print_r/var_export donâ€™t correspond to its complete internal representation, and end up making SimpleXML look more simplistic than it really is. It exposes some of its child elements as if they were properties which can be accessed with the -> operator, but still preserves the full document internally, and you can do things like access a child element whose name is a reserved word with the [] operator as if it was an associative array.

You donâ€™t have to fully commit to one or the other, because PHP implements the functions:

simplexml_import_dom(DOMNode)
dom_import_simplexml(SimpleXMLElement)

This is helpful if you are using SimpleXML and need to work with code that expects a DOM node or vice versa.

PHP also offers a third XML library:

XML Parser (an implementation of SAX, a language-independent interface, but not referred to by that name in the manual) is a much lower level library, which serves quite a different purpose. It doesnâ€™t build objects for you. It basically just makes it easier to write your own XML parser, because it does the job of advancing to the next token, and finding out the type of token, such as what tag name is and whether itâ€™s an opening or closing tag, for you. Then you have to write callbacks that should be run each time a token is encountered. All tasks such as representing the document as objects/arrays in a tree, manipulating the document, etc will need to be implemented separately, because all you can do with the XML parser is write a low level parser.
The XML Parser functions are still quite helpful if you have specific memory or speed requirements. With it, it is possible to write a parser that can parse a very long XML document without holding all of its contents in memory at once. Also, if you not interested in all of the data, and donâ€™t need or want it to be put into a tree or set of PHP objects, then it can be quicker. For example, if you want to scan through an XHTML document and find all the links, and you donâ€™t care about structure.

9 March, 2009

The free and non-free Creative Commons licenses

Some Creative Commons licenses are ‘free’ in the sense that open source software is free.
Other Creative Commons licenses are ‘not free’ in the sense that they restrict use of the material in ways that is counter to ‘freedom’ as defined by the Free Software Foundation or the Open Source Initiative, to draw a parallel with software licenses.

In this article I just wanted to clarify the difference for those using a CC license, so that they are not inadvertently preventing others from using their work with an unnecessarily restrictive license.

Thankfully, the creativecommons.org website now has a useful “Approved for Free Cultural Works” icon and colour scheme, to help you tell them apart. For example:

This Attribution Share-Alike Generic 2.5 license is green and has the icon, so you know it is a ‘free’ license.
This Attribution Non-Commercial Generic 2.5 license is yellow and does not have the icon, so it is not a ‘free’ (as in freedom) license.

Defining freedom

Creativecommons.org has chosen to adopt the meaning of ‘freedom’ as defined by Freedomdefined.org, a definition which is basically equivalent to that used for open source software. It states that for a work to be considered a free cultural work, it must have the following four freedoms:

The freedom to use the work and enjoy the benefits of using it
The freedom to study the work and to apply knowledge acquired from it
The freedom to make and redistribute copies, in whole or in part, of the information or expression
The freedom to make changes and improvements, and to distribute derivative works

Freedom applies to everyone

For these freedoms to be valid, they must be unconditional and apply to everyone, regardless of who they are or what they intend to use the work for. This means that any license with a non-commercial clause is not free in the sense that any business wanting to use the work commercially would have to make a separate arrangement with the author. One of the basic rules of open source software is that businesses are allowed to use it in order to profit from it; if what they could do with it was restricted, open source software would be avoided by commercial enterprises. Companies wouldn’t be installing Linux on their clients’ systems, for example.

The same applies to non-software cultural works: allowing anyone the freedom to use the work, regardless of whether they intend to profit, enables businesses to assist the proliferation of the work.

Freedom includes freedom to make changes

These freedoms also include the freedom to make changes and improvements. If a license does not allow derivative works, it is another example of restricting users’ ability to do whatever they like with the work. The ability to modify the work is seen as advantageous for the community because it allows the work to be improved by others, without a separate arrangement being made with the original author. To draw a comparison with open source software again, if businesses were not allowed to modify Linux and provide their own version of it, many businesses would not be able to exist, and the behaviour of Linux would be entirely under the control of a single entity. Allowing others to modify your work allows businesses to exist that support the work through improving it.

Some restrictions are still acceptable

Requiring copyright notices to be preserved, or requiring any derivative works to be given the same license (a share-alike clause) are still considered acceptable restrictions by the free software movement and free cultural works. It’s just that any restrictions beyond this, such as preventing commercial uses and preventing any modifications, are not.

Quick guide to choosing a Creative Commons license

There is nothing wrong with choosing a non-free license for your work: It is the creator’s right not to license their work, or to apply any restrictions they desire. If you are considering releasing something under a Creative Commons license, you should consider which rights you want to retain. One reason for retaining a right would be if you want to make money from it.

So, here’s a quick guide on how to choose between the licenses:

Including a non-commercial clause allows you to retain the sole right to make money from distributing the work. If allowing others freedom to use the work is more important to you than making money, then don’t include a non-commercial clause.
Not allowing derivative works allows you to retain the sole right to alter the work, which allows you to reserve the right to charge money for or prevent alterations. If allowing others to use and improve the work is more important to you that making money from or preventing alterations, then make sure you allow derivative works.
If you do not care about money, or controlling who is allowed to do what with the work (save put a copyright notice on it), but you do care that the work is free for all to use and modify how they see fit, then make sure the Creative Commons license you choose is a green one, with the ‘Approved for Free Cultural Works’ icon. This will ensure that your work receives the best chance of being re-used and shared by as many people as possible.