ppa2pup alternative approaches

I agree with issue 56 that it is best if pkg is "no arch". Currently 0setup is much faster than ppa2pup, and if things stay that way we can continue to use 0setup when it exists. However, I would like to look at possible ways to improve ppa2pup.

The first alternative is to use awk. The following code was written by jamesbond in the woofnext branch:

function fixdepends(s) {
	split(s,a,","); s="";
	for (p in a) {
		gsub(/[ \t]*\(.*\)|[ \t]\|.*|:any/,"",a[p]); s=s "," a[p]
	}
	sub(/^,/,"",s); return s;
}
/^Package:/     { sub(/^Package: /,"");  PKG=$0; }
/^Version:/     { sub(/^Version: /,"");  PKGVER=$0; }
/^Filename:/    { sub(/^Filename: /,""); PKGPATH=$0; sub(/.*\//,""); PKGFILE=$0; }
/^Priority:/    { sub(/^Priority: /,""); PKGPRIO=$0; }
/^Section:/     { sub(/^Section: /,"");  PKGSECTION=$0; }
/^MD5sum:/      { sub(/^MD5sum: /,"");   PKGMD5=$0; }
/^Depends:/     { sub(/^Depends: /,"");     PKGDEP=fixdepends($0) "," PKGDEP; }
/^Pre-Depends:/ { sub(/^Pre-Depends: /,""); PKGDEP=fixdepends($0) "," PKGDEP; }
/^$/            { print PKG "|" PKGVER "|" PKGFILE "|" repo_url "/" PKGPATH "|" PKGPRIO "|" PKGSECTION "|" PKGMD5 "|" PKGDEP ;
                  PKG=""; PKGVER=""; PKGFILE=""; PKGPATH=""; PKGPRIO=""; PKGSECTION=""; PKGMD5="";  PKGDEP=""; }
'
			# remove duplicates, use the "later" version if duplicate packages are found
			< $REPO_DIR/$LOCAL_PKGDB > /tmp/t.$$ \
			awk -F"|" '{if (!a[$1]) b[n++]=$1; a[$1]=$0} END {for (i=0;i<n;i++) {print a[b[i]]}}'
			mv /tmp/t.$$ $REPO_DIR/$LOCAL_PKGDB
		fi
		if [ -z "$WITH_APT_DB" ] || [ $DRY_RUN ]; then rm -f $CHROOT_DIR$APT_PKGDB_DIR/"$apt_pkgdb"; fi
	done
}

/builders/deb-build.sh#L141
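
The duplicate-removal one-liner at the end of that excerpt can be tried on its own. This is a minimal sketch, assuming a made-up three-line package db (the names, versions and file names are hypothetical); the awk program itself is copied unchanged from the excerpt:

```shell
#!/bin/sh
# Made-up package db lines; "foo" appears twice.
db='foo|1.0|foo_1.0.deb
bar|2.0|bar_2.0.deb
foo|1.1|foo_1.1.deb'

# b[] remembers the order in which package names are first seen,
# a[] keeps the most recently read line for each name.
deduped=$(printf '%s\n' "$db" |
  awk -F"|" '{if (!a[$1]) b[n++]=$1; a[$1]=$0} END {for (i=0;i<n;i++) {print a[b[i]]}}')

printf '%s\n' "$deduped"
```

The second foo line replaces the first one but keeps the position of the first occurrence, so "later" here means later in the file, not a higher version number.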

I notice that the fix-version part of the code is considerably shorter (and perhaps more efficient) than the related code in ppa2pup:

    newpkgdeps="$(echo "$pkgdeps" \
      | sed \
        -e 's/  / /g'     \
        -e 's/ (= 2:/_/g' \
        -e 's/ (= 1:/_/g' \
        -e 's/ (<< /-/g'  \
        -e 's/ (= /-/g'   \
        -e 's/ (>= /-/g'  \
        -e 's/ (<= /-/g'  \
        -e 's/) / /g'     \
        -e 's/), /,/g'    \
        -e 's/)//g'       \
        -e 's/, /,/g'     \
        -e 's/ | /,/g'    \
        -e 's/| /,/g'     \
        -e 's/ |/,/g'     \
        -e "s/,$//g"      \
        -e 's/\.1~//g'    \
        -e 's/\.2~//g'    \
        -e 's/,/,+/g'     \
        -e 's/-2:/_/g'    \
        -e 's/-1:/_/g'    \
        -e 's/Depends: /Depends: +/g' \
      2>/dev/null)"

/usr/sbin/ppa2pup#L233

I do wonder whether jamesbond's fixdepends function is functionally the same as the above series of sed commands.
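
One way to check is to feed the same line to both. This is only a sketch: the sample Depends line is made up, and I am assuming that $pkgdeps in ppa2pup still carries the "Depends: " prefix (the last sed rule suggests so, but that is my reading, not something I have verified). The awk function and the sed rules are copied from the two excerpts above.

```shell
#!/bin/sh
# Made-up Depends value for illustration only.
sample='libc6 (>= 2.14), debconf (>= 0.5) | debconf-2.0, python3:any'

# jamesbond's fixdepends, copied unchanged from the awk excerpt.
awk_deps=$(printf '%s\n' "$sample" | awk '
function fixdepends(s) {
	split(s,a,","); s="";
	for (p in a) {
		gsub(/[ \t]*\(.*\)|[ \t]\|.*|:any/,"",a[p]); s=s "," a[p]
	}
	sub(/^,/,"",s); return s;
}
{ print fixdepends($0) }')

# The sed rules from ppa2pup, applied to the same line
# (assumption: the input includes the "Depends: " prefix).
sed_deps=$(printf 'Depends: %s\n' "$sample" \
  | sed \
    -e 's/  / /g'     \
    -e 's/ (= 2:/_/g' \
    -e 's/ (= 1:/_/g' \
    -e 's/ (<< /-/g'  \
    -e 's/ (= /-/g'   \
    -e 's/ (>= /-/g'  \
    -e 's/ (<= /-/g'  \
    -e 's/) / /g'     \
    -e 's/), /,/g'    \
    -e 's/)//g'       \
    -e 's/, /,/g'     \
    -e 's/ | /,/g'    \
    -e 's/| /,/g'     \
    -e 's/ |/,/g'     \
    -e "s/,$//g"      \
    -e 's/\.1~//g'    \
    -e 's/\.2~//g'    \
    -e 's/,/,+/g'     \
    -e 's/-2:/_/g'    \
    -e 's/-1:/_/g'    \
    -e 's/Depends: /Depends: +/g')

printf 'awk: %s\n' "$awk_deps"
printf 'sed: %s\n' "$sed_deps"
```

Reading the two side by side, they do not look equivalent. fixdepends throws away the version constraint, everything after a "|", and ":any", while the sed chain keeps the version as name-version, turns alternatives into extra entries, and prefixes each entry with "+". Note also that awk does not guarantee the iteration order of `for (p in a)`, so fixdepends may reorder the dependencies.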

Now for the other approach. It seems to me that rewriting all the package metadata to a separate file is a lot of extra writing that isn't required. Also using the grep command on each of these separate files is a lot of unnecessary searching.

Why not write directly to the new package db file instead? Also, grep seems like overkill to me for matching a constant string. I'm not sure whether it has a significant performance cost, but I have written related code before that doesn't use grep:

while read -r -u15 -d $'\n' line; do
  #echo "line=$line"
  #read -p "Press enter to continue"
  case "$line" in
  'Package: '*)
    field=${line%%': '*}; val=${line#*': '}
    pet_specs_PKG_NAME="$val"

/woof-next/woof-code/rootfs-packages/sync_pet_specs_fm_dpkg/usr/bin/sync_pet_specs_fm_dpkg.sh
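
To make the single-pass idea concrete, here is a rough sketch of a loop that reads Debian Packages stanzas and writes one record per package directly, using only `case` patterns and parameter expansion (no grep, no per-package temporary files). The function name pkgs2db, the choice of fields, and the four-column output layout are all hypothetical; this is not the real Puppy package db format.

```shell
#!/bin/sh
# Hypothetical helper: Packages stanzas on stdin, one
# "name|version|filename|depends" record per package on stdout.
pkgs2db() {
  name='' ver='' file='' deps=''
  # The "|| [ -n "$line" ]" part also handles a last line without a newline.
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      'Package: '*)  name=${line#*': '} ;;
      'Version: '*)  ver=${line#*': '} ;;
      'Filename: '*) file=${line#*': '} ;;
      'Depends: '*)  deps=${line#*': '} ;;
      '')
        # A blank line ends a stanza: write the record and reset.
        if [ -n "$name" ]; then
          printf '%s|%s|%s|%s\n' "$name" "$ver" "$file" "$deps"
        fi
        name='' ver='' file='' deps='' ;;
    esac
  done
  # Flush the last stanza if the input did not end with a blank line.
  if [ -n "$name" ]; then
    printf '%s|%s|%s|%s\n' "$name" "$ver" "$file" "$deps"
  fi
  return 0
}

# Example run on a made-up stanza.
printf 'Package: foo\nVersion: 1.0\nFilename: pool/f/foo_1.0.deb\nDepends: bar\n\n' | pkgs2db
```

The output can be redirected straight into the new db file, so each package is written once. The Depends value is passed through untouched here; the fix-up step discussed above would still have to be applied to it.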

I think that I'll try jamesbond's approach first.

01micko actually suggested writing debdb2pupdb in awk, but wdlkmpx "may" have ignored the suggestion!

Edited Nov 17, 2019 by s243a