How it Works

Every rpm has a header. The header contains a complete file list, the package description, and lists of the features and libraries the package provides, what it requires, what it conflicts with, and so on. To decide what a package needs in order to be installed, rpm needs the information in the header. Fortunately, that is all it needs. Many other update tools build an index from that information: they extract the important parts, send them to the client, and use them to determine what should be installed. Yum instead copies the headers themselves out of the rpms on the server (called a repository; it is just an HTTP or FTP server, nothing fancy or custom), and the client part of yum uses those headers to determine what needs to be installed, upgraded, or erased. The benefit of using the rpm header is that yum can rely on rpm to determine what should happen: all the information is in a format native to rpm, so no custom dependency calculation code is required. This also makes yum quite fast. Finally, by not writing custom dependency code, yum avoids reinventing the wheel with a dependency engine parallel to rpm's. Rpm does all the hard work.
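To make the role of the header concrete, here is a minimal Python sketch of the kind of information a header carries and of how provides/requires lists answer the question "what else is needed?". The Header class, the missing_requirements function, and the package names are all hypothetical; this is a toy model, not rpm's header format and not yum's code, since in yum the real calculation is left to rpm.

```python
from dataclasses import dataclass, field


@dataclass
class Header:
    """Toy stand-in for an rpm header (hypothetical, not the real format)."""
    name: str
    version: str
    files: list = field(default_factory=list)
    provides: set = field(default_factory=set)
    requires: set = field(default_factory=set)
    conflicts: set = field(default_factory=set)


def missing_requirements(candidate, installed):
    """Return what `candidate` requires that no installed header provides."""
    available = set()
    for hdr in installed:
        # A package provides its own name plus anything it lists explicitly.
        available |= hdr.provides | {hdr.name}
    return candidate.requires - available


# Illustrative data only; these are not taken from a real repository.
installed = [Header("glibc", "2.2.5", provides={"libc.so.6"})]
candidate = Header("foo", "1.0", requires={"libc.so.6", "libbar.so.1"})

print(sorted(missing_requirements(candidate, installed)))  # ['libbar.so.1']
```

Everything the function needs comes from the headers alone, which is why shipping headers to the client is sufficient.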

Headers

The rpm header is typically a small part of each rpm. However, a whole bunch of them can add up. The headers for all of Red Hat Linux 7.3, for example, total 37M. But that is for ALL the headers. Unless you're running a spartan system, you'll probably have about a third to a half of those packages installed, and yum won't need to get the headers of the rpms you have installed: those are already in the rpm database on your system. So on average you will have to store about 15-20M of headers, and you'll only have to download the ones you don't already have cached in your yum cache directory. For an average update you'll probably have to download about 12KB to 1M of headers. A pretty trivial amount.
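The selection described above amounts to a set difference: headers in the repository, minus those for installed rpms, minus those already cached. The following Python sketch shows that logic under stated assumptions. The function name headers_to_fetch, the package ids, and the sizes are made up for illustration and are not yum's actual interface or real measurements.

```python
def headers_to_fetch(repo_headers, installed, cached):
    """Pick the headers a client still has to download.

    repo_headers: dict mapping a package id to its header size in bytes
    installed:    ids already present in the local rpm database
    cached:       ids already present in the local header cache
    """
    needed = set(repo_headers) - set(installed) - set(cached)
    total_bytes = sum(repo_headers[pkg] for pkg in needed)
    return sorted(needed), total_bytes


# Made-up ids and sizes, for illustration only.
repo = {
    "foo-1.0-1": 12000,
    "bar-2.1-3": 8000,
    "baz-0.9-2": 20000,
    "qux-3.0-1": 5000,
}
needed, total_bytes = headers_to_fetch(
    repo, installed={"foo-1.0-1"}, cached={"bar-2.1-3"}
)
print(needed, total_bytes)  # ['baz-0.9-2', 'qux-3.0-1'] 25000
```

After the first run most headers are either installed or cached, so later updates only pay for headers of packages that are new to the repository.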