Main Page | Class Hierarchy | Data Structures | File List | Data Fields | Globals | Related Pages

Frequently Asked Questions (FAQs)

FAQIndex

Why implement a new hash system?

Why implement a new hash system?

As noted in the introduction, the eD2k system is much too vulnerable to survive in an hostile environment like the one that is slowly starting to take shape and gain momentum. A stronger integrity check system with finer granuality are necessary in order to both reduce corruption waste and improve resilience against more subtle attacks.

What does MAFS offer to fill in the gaps?

The immediate advantages would be:

Are there any other advantages?

As collateral benefits, MAFS's scalable chunk size capability makes the following possible:

Are there any inconvenients?

There are a few things MAFS cannot do (yet):

How does it differ from popular systems?

I have limited knowledge of alternatives but here is how I think my implementation compares with the two I know:

eDonkey

9500KB chunks can only be verified against the chunk hash. When a chunk hash fails, the chunk is discarded and re-downloaded again and again until it hashes correctly. Assuming a malicious client on dial-up is doing network peppering at 4KB/sec and 200B sessions, this client would be capable of corrupting 20 chunks per second, representing corruption rates of 190MB/second, 680GB/hour, 16TB/day. Imagine what a well-funded and connected distributed peppering farm could do... even a small distributed farm would have the potential of killing the eDonkey network on which eMule is based: 10Mbps, 1KB/session = 1000 sessions per second = 9.5GB corrupted per second, 34TB per hour, 820TB per day per 10Mbps peppering farm. Such farms could effectively cause the majority of clients to re-download one or more of those 9500KB chunks ad-infinitum. Basically, the eDonkey legacy system is very much doomed. The relevant improvements from my system here are:

Bitzi/Merkle Tiger Tree

Although already a huge improvement over the eDonkey system, it assumes that Tiger hashes are and will remain unbreakable for the forseeable future. The tree is built from simple direct data hashes and each node is only aware of its two children nodes, meaning that a successful forgery (same hash, 'stealth corruption') of one data block or sub-tree top hash would be undetectable. The Bitzi SHA1 file hash would fail but the tree would be unable to locate the corrupted part. While the basic idea of a tree is neat, a simple tree is no stronger than the hashes used so if Tiger later turns out to be not as secure as it was thought, a simple Tiger Tree could become easy prey. To work around this possibility, my implementation goes a little extra distance:

Summary

I am sticking with the general tree structure because it is convenient, flexible and efficient. With the extra work I added to the basic tree, the structure as a whole should become much stronger than the hashes it uses, meaning the system should be able to survive even if its base hash algorithm becomes readily compromisable. Chances are that some people can already guess how I plan to do this since there is at least one obvious and simple way to do it and that is what I am working on. I probably could have reused most of the code from a Merkle implementation but since I do not have such code, I will simply keep on whipping up my own since it is already nearing completiion.

How long will it take?

I expect to start privately integrating and testing this new hash system with eMule in late 2003 at best. Right now, the most realistic timeline (after seeing how slowly things have been going lately) looks as follows:
  1. Implementation and API disclosure, already started back in September.
  2. Closed-source custom demo app for people to play with while I start modding the stock mule, late 2003.
  3. Closed-source limited pre-alpha modded client testing, Q1 2004.
  4. Semi-open source broader alpha modded client testing, late Q1 2004.
  5. Full-release open beta-testing, Q2 2004.
  6. Definitive implementation, maybe late H1 2004.

It seems my previous Xmas 2003 'most optimistic' estimate was quite far off! (or it may just be that I had an exceptionally crappy and distracting last month that threw my rythm completely off-track, in which case an Xmas alpha might still be possible if I get back on-track with full steam soon.)

Source-code related stuff

This section addresses issues related to how the source code is laid out and other source-code related things.

Why are there missing lines in the CPP and H files?

When Doxygen processes source files, it strips them from the the comment blocks used to generate the documentation you are seeing in these pages. It also optionally strips the standard C/C++ comments to leave only the code.

Other stuff...

This is the FAQ section where I dump questions that have too few related issues to earn their own separate FAQs section.

Why choose to mod eMule instead of some other open-source P2P client?

Well, I have been using eMule for most of the last two or so years. (I think the first version I used was 0.23a)

It did what I needed it to do often better than eDonkey. Because it is open-source, it also means I can change it to make it behave more like I want it to. Since it does most of what I need, is generally nice, open-source, etc. I can mod the mule I know or mod a different animal but I am probably too lazy to get involved in new breeds so I decided that it was not necessary to change client/P2P network and started modding the mule. My first (and only so far) significant change being Moonlight's Upload/Download Regulator (MUDR).

While I am stalled working on MAFS, I do random minor projects like Moonlight's Sparse File Support (MSFS) or other micro-mods (one or two GUI controls with not that many more source code lines) to make the mule a little more friendly or less wasteful. These tweaks (once there will be enough of them) will become the Moonlight's Mod.

This documentation is nifty, what are you using?

Most of what you are seeing is automatically generated by Doxygen (www.doxygen.org) from plain source code comments and a few simple control tags. For more information about Doxygen, go to http://www.doxygen.org.

Another good thing about doing it this way is that by documenting as you go, some things that can make you pause and do nothing come together naturally when you think about them while writing documentation and have to explain why one thing was done in a specific manner. Sometimes, this makes you realize that some functions were completely unnecessary and come up with a much better implementation.

Finally, documenting along the way also gives some indication of progress and makes it easier to jump back in the code after not working with it for a while. There are many more advantages but I think this is enough.

One setback of trying to write good documentation is that it takes roughly as much time to write as the code it covers.

Hits since December 5, 2003:
Count.cgi?display=counter|df=vllmtqtwVL2.dat

Generated on Fri Dec 5 04:42:55 2003 for Moonlight's eMule Tweaks Documentation by doxygen 1.3.4