index updates via diff-only files
To dramatically speed up index updates, there should be a mode where the client downloads only the changes since its last update. The daily changes should be a tiny percentage of the size of the full index. I think the easiest way to handle this is for the diff format to be the same index XML, but containing only the changes. This easily represents changes and additions, but perhaps not subtractions. For example, the full index is currently wrapped in <fdroid></fdroid>. The diffs could use blocks like <additions></additions> and <subtractions></subtractions>, whose contents would be the same as in <fdroid></fdroid> but would trigger additions and subtractions respectively. For example, if a repo changes its name, it would look like:
<?xml version="1.0" ?>
<additions>
<repo name="Guardian Project Official Releases">
</repo>
</additions>
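One way a client might apply such an additions block is sketched below in Python with the standard xml.etree.ElementTree module. The function name apply_repo_additions is hypothetical, and the merge rule (attributes in the diff overwrite attributes of the same name, all others are kept) is an assumption, since the proposal does not spell out merge semantics.

```python
import xml.etree.ElementTree as ET


def apply_repo_additions(index_root, additions_root):
    """Merge the <repo> attributes of an <additions> diff into a full index.

    Assumption: an attribute present in the diff replaces the attribute of
    the same name in the full index; attributes absent from the diff stay.
    """
    new_repo = additions_root.find("repo")
    if new_repo is None:
        return  # this diff does not touch the repo metadata
    repo = index_root.find("repo")
    if repo is None:
        # The full index had no <repo> element yet, so adopt the one from the diff.
        index_root.insert(0, new_repo)
        return
    repo.attrib.update(new_repo.attrib)


index = ET.fromstring(
    '<fdroid><repo name="Old Name" timestamp="1456598273"></repo></fdroid>'
)
additions = ET.fromstring(
    '<additions><repo name="Guardian Project Official Releases"></repo></additions>'
)
apply_repo_additions(index, additions)
```

With this rule, the repo name changes while the existing timestamp attribute is left untouched.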
If an APK is deleted, it would look like:
<?xml version="1.0" ?>
<subtractions>
<application id="info.guardianproject.cacert">
<package>
<version>0.0.2.20111012</version>
<versioncode>4</versioncode>
<apkname>CACertMan-0.0.2-alpha-20111011.apk</apkname>
<hash type="sha256">251ebd40ce4a281a2292692707fb1e9c91428994cbad80a416a297db51069eb8</hash>
<sig>a0eeebb161f946e3516945fae8a92a3e</sig>
<size>172263</size>
<sdkver>7</sdkver>
<added>2014-06-30</added>
<permissions>ACCESS_SUPERUSER</permissions>
<features>android.hardware.touchscreen</features>
</package>
</application>
</subtractions>
My worry about this approach to subtractions is that it would be hard to tell whether to just remove elements from an APK or app, or delete the entire thing. A delete command might be needed then, e.g. to delete all info about an app:
<?xml version="1.0" ?>
<deletions>
<application id="info.guardianproject.cacert">
</application>
</deletions>
Or to delete a single APK:
<?xml version="1.0" ?>
<deletions>
<application id="info.guardianproject.cacert">
<package>
<hash type="sha256">251ebd40ce4a281a2292692707fb1e9c91428994cbad80a416a297db51069eb8</hash>
</package>
</application>
</deletions>
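The two deletion cases above could be handled on the client roughly as follows. This is a minimal Python sketch, not an existing F-Droid API: the function name apply_deletions is hypothetical, and it assumes that an empty <application> element means "delete the whole app" while an <application> containing <package> elements means "delete only the APKs whose hash matches".

```python
import xml.etree.ElementTree as ET


def _hash_of(package):
    """Return the text of a package's <hash> element, or '' if it has none."""
    return (package.findtext("hash") or "").strip()


def apply_deletions(index_root, deletions_root):
    """Apply a <deletions> diff to a full index tree, modifying it in place."""
    for del_app in deletions_root.findall("application"):
        app_id = del_app.get("id")
        app = next(
            (a for a in index_root.findall("application") if a.get("id") == app_id),
            None,
        )
        if app is None:
            continue  # the app is already gone, nothing to do
        del_packages = del_app.findall("package")
        if not del_packages:
            # Assumption: an empty <application> deletes all info about the app.
            index_root.remove(app)
            continue
        # Otherwise delete only the APKs identified by hash.
        hashes = {_hash_of(p) for p in del_packages}
        for package in app.findall("package"):
            if _hash_of(package) in hashes:
                app.remove(package)
```

Matching on the hash alone mirrors the single-APK example, where the hash is the only element given inside <package>.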
The logic of fetching the diff indexes would go something like this. First, fdroidserver would generate the index and diffs on the file system, with diffs being labeled using the timestamp of the index file that they update:
- index.jar - the full index
- index-1456598273.jar - the changes since 1456598273
- index-1456531016.jar - the changes since 1456531016
The client would take the timestamp of its current index from <repo timestamp="1456598273"> and generate the filename to fetch, here index-1456598273.jar. If that file did not exist, the client would then fetch index.jar.
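The filename selection with its fallback can be sketched in Python as below. The function names diff_index_name and choose_index are hypothetical, and the set of available filenames stands in for whatever existence check a real client would perform against the repo (for example, an HTTP request that returns 404 for a missing diff).

```python
import xml.etree.ElementTree as ET

FULL_INDEX = "index.jar"


def diff_index_name(local_index_xml):
    """Build the diff filename from the timestamp of the locally cached index.

    Returns None when the cached index carries no <repo timestamp="...">.
    """
    root = ET.fromstring(local_index_xml)
    repo = root.find("repo")
    if repo is None or repo.get("timestamp") is None:
        return None
    return "index-{}.jar".format(repo.get("timestamp"))


def choose_index(local_index_xml, available):
    """Pick the diff file if the server offers it, otherwise the full index.

    `available` is a stand-in (an assumption of this sketch) for checking
    which files exist on the server.
    """
    name = diff_index_name(local_index_xml)
    if name is not None and name in available:
        return name
    return FULL_INDEX


cached = '<fdroid><repo timestamp="1456598273"></repo></fdroid>'
server_files = {"index.jar", "index-1456598273.jar", "index-1456531016.jar"}
chosen = choose_index(cached, server_files)
```

A client whose cached timestamp has no matching diff on the server simply falls back to downloading index.jar.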