New sfArk utility for Linux

j_e_f_f_g
Established Member
Posts: 1063
Joined: Fri Aug 10, 2012 10:48 pm

Re: New sfArk utility for Linux

Post by j_e_f_f_g »

sizeof(SfArkId) will return the size of the pointer, not the size of the string that the pointer points to
SfArkId is not a pointer. It's a 6 byte array. If the Mac version of GCC is reporting this array's size as 8, then it must be padding the array out to 8 bytes. But it should still report the size as 6. This seems like a mac gcc bug to me. It would be better to just change sizeof(SfArkId) simply to 6.

I don't do mac software, so I don't know if there are other padding issues with mac gcc, but I'd try compiling the utility with a different mac compiler.

raboof
Established Member
Posts: 1702
Joined: Tue Apr 08, 2008 11:58 am
Location: Deventer, NL

Re: New sfArk utility for Linux

Post by raboof »

j_e_f_f_g wrote:
sizeof(SfArkId) will return the size of the pointer, not the size of the string that the pointer points to
SfArkId is not a pointer.
Yes it is:

Code: Select all

static const char * SfArkId = ".sfArk";
It's a 6 byte array.
While arrays and pointers in C are closely related, 'sizeof' really does care how a variable is declared:

Code: Select all

  char array[10] = "asdfasdfx";
  char * pointer = array;

  /* sizeof yields a size_t, so print it with %zu rather than %d */
  printf("sizeof(pointer) = %zu\n", sizeof(pointer));
  printf("sizeof(array) = %zu\n",   sizeof(array));
will (even on i686 linux) yield:

Code: Select all

sizeof(pointer) = 4
sizeof(array) = 10
If the Mac version of GCC is reporting this array's size as 8, then it must be padding the array out to 8 bytes. But it should still report the size as 6. This seems like a mac gcc bug to me. It would be better to just change sizeof(SfArkId) simply to 6.
1) With your code, it never returns 6, it returns 4 or 8 depending on architecture.
2) I'm pretty sure that's just how sizeof is specified, though I don't have a copy of the spec handy.

So this is a bug in unsfark, fixed it in https://github.com/raboof/unsfark/commi ... 142e98cdfd (though there might be more like this lurking, haven't checked).
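
To make the array-versus-pointer distinction concrete, here is a minimal sketch of the two declarations side by side. The names `SfArkIdPtr`, `SfArkIdArray` and `has_sfark_id` are illustrative assumptions and are not taken from the unsfark source or from the commit linked above.

```c
#include <stddef.h>
#include <string.h>

/* Pointer form: sizeof(SfArkIdPtr) is the size of a pointer (4 or 8). */
static const char *SfArkIdPtr = ".sfArk";

/* Array form with no terminating NUL: sizeof(SfArkIdArray) is exactly 6. */
static const char SfArkIdArray[6] = { '.', 's', 'f', 'A', 'r', 'k' };

/* Hypothetical helper: returns 1 if buf (len bytes long) starts with the
   6-byte ID, 0 otherwise. */
static int has_sfark_id(const unsigned char *buf, size_t len)
{
    if (len < sizeof(SfArkIdArray))
        return 0;
    return memcmp(buf, SfArkIdArray, sizeof(SfArkIdArray)) == 0;
}
```

With the array form, `sizeof` gives the same answer on 32-bit and 64-bit builds. With the pointer form, `strlen(SfArkIdPtr)` would be needed to get 6.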

We need a testsuite. I tried some of the soundfonts from http://www.soundfonts.gonet.biz/list.php but so far none of them convert correctly:

* Splendid_72M.sf2: works but crackles badly
* Cello Section.sf2: works but crackles
* tgk3.SF2: does not load at all
* 031.9mg reality gm gs bank.sfark: fails to unsfark (bad checksum)
* 027.3mg symphony hall bank.sfark: fails to unsfark (bad checksum)
* GUITAR-NYLON-PROTRAX.SF2: fails to unsfark (bad checksum)
* Piano Bechstein.sfArk: fails to unsfark (bad checksum)

Do you have any public domain sfArks that do convert correctly? And do you know how I can unpack an sfpack?

ssj71
Established Member
Posts: 1292
Joined: Tue Sep 25, 2012 6:36 pm

Re: New sfArk utility for Linux

Post by ssj71 »

First I'd like to say that I think what j_e_f_f_g did is some pretty awesome reverse engineering work, and even his lampooning helps me be a little bit of a better programmer.

But... if there are licensing issues or anything there is another option for getting at sfark soundfonts. While I also disagree with proprietary formatting and hope sfark does die, Melody Machine's sfark utility is still freely available and runs perfectly under wine. http://www.melodymachine.com/sfark.htm

Linux is about freedom and choice right? I'm not a big winer but it worked. I'm pretty sure it was the first program I installed and ran with wine. Anyway, just want to get the facts out there. Hopefully unsfark proves triumphant at unsfarking the universe.

_ssj71

music: https://soundcloud.com/ssj71
My plugins are Infamous! http://ssj71.github.io/infamousPlugins
I just want to get back to making music!

Maxime
Posts: 1
Joined: Tue Nov 27, 2012 8:30 pm

Re: New sfArk utility for Linux

Post by Maxime »

Hello,

I'm the author of Arachno SoundFont, released in 2010 as an .sfArk file. So, you can count me among all the "morons" who've used it to compress their banks!

I chose sfArk over all the other formats I tried because:
- it was the most efficient, compressing Arachno SoundFont from its original 148,20 MB size down to 70,55 MB;
- it was the only specialized compression format I could find that was available under the three major platforms (Windows, Mac OS X, Linux).

The cross-platform issue was a real concern, so as soon as I confirmed that decompressors were available on alternative OSes, I chose sfArk without any other hesitation.

But... I wasn't actually aware of the fact that the Mac and Linux decompressors were really outdated.
According to some Arachno SoundFont users' reports, the Linux port requires some odd libraries to run, and the Mac OS X port only works on PowerPC architectures (although I was fairly sure that I tried it on my now deceased MacBook Pro - well).

So, thanks for pointing out all the crap stuff of this format, and congrats for all your hard work.

Now, I'd like to ask you some questions:
- I'd like to include your sfArk utility in my Arachno SoundFont zip archive as another helper for Linux users. Hope you'll agree with that!
- is there any compression format you may suggest to use for SoundFonts? Given that:
* both ZIP and 7z offer mediocre to average compression ratios with my SoundFont - 128 MB and 109 MB, respectively;
* that SFPack is Windows-only and unsupported as well (that's why I didn't choose it)
* that RAR is a proprietary format you'd like to avoid, obviously.

Also, did you get any feedback from users who may have succeeded in compiling your software under Mac OS X as a Universal Binary app for all Mac OS X users?

Many thanks for your experience and help.

Regards from France,
Maxime

j_e_f_f_g
Established Member
Posts: 1063
Joined: Fri Aug 10, 2012 10:48 pm

Re: New sfArk utility for Linux

Post by j_e_f_f_g »

RAR is technically proprietary, but the algorithm is known and the author has allowed its use in freeware programs that are well-supported on windows, mac, and linux. And unlike sfark, algorithms like zip and rar are properly written to not introduce rounding errors. (sfark as written is a lossy format).

You can achieve similar results by just doing what sfark does -- double compression. Many folks pack up some files in a rar archive, and then zip that rar. But this is done when the source is many gigs. Doing this to save a couple meg is kind of pointless.

I don't care what you do with the utility, but its ultimate purpose is to help eradicate sfark -- not facilitate it.

There are about 10 different variations on the sfark format. (It really is awful). I found and tested with sfark files in v2's various "turbo" and "fast" algorithms, as well as v1 files. Those work. There appears to be a bug with the "standard" algorithm. I don't know when I'll get a chance to look at that. (The bug with the id string was introduced by someone else. My original, working code was an array, not a pointer. That shouldn't have changed.) Also, I discovered that Swami doesn't like v1 files. It loads them blank. Vienna doesn't.

raboof
Established Member
Posts: 1702
Joined: Tue Apr 08, 2008 11:58 am
Location: Deventer, NL

Re: New sfArk utility for Linux

Post by raboof »

j_e_f_f_g wrote:The bug with the id string was introduced by someone else. My original, working code was an array, not a pointer. That shouldn't have changed.
Blimey, you're right - I introduced that in https://github.com/raboof/unsfark/commi ... 80f2413d79 . Sorry about the noise, my bad.

andy2i
Established Member
Posts: 9
Joined: Tue Jan 08, 2013 12:45 pm

Re: New sfArk utility for Linux

Post by andy2i »

And now that that is over and done, and for anybody that might find it interesting, here are a few actual facts...

* The original version of sfArk was available in 1997 and generally released in 1998. It was intended as a fairly quick fix for a significant problem - large audio files and slow internet connections.

* Use of LSPACK was only in sfArk 1.x (1998) and dropped in 2.x (2001) specifically because LSPACK was only available for Windows. LSPACK was a commercially available ZLIB-style library. ZLIB itself was not in such wide use at that time either, but of course became something of a standard later, so switching to ZLIB for sfArk 2.x was a sensible move.

* The decision to use Intel floating point maths was reasonable at the time, based on the fact that even in 2001 SoundFonts were used almost exclusively on Intel/Windows. Their migration to Mac came later, and Linux later still. Nonetheless, there is no insurmountable problem with this approach, just additional complexity.

* "rounding errors" are not present "in" the sfArk file. All the floating point calculations done by sfArk relate to making a prediction of future values, which by definition, is always "wrong" (it would still be "wrong" even if integer calculations were used). However, the fact that different architectures do floating point maths differently is a complication, meaning that care is needed when coding a compatible decompressor. Now with the benefit of 20/20 hindsight it would probably have been a good idea to develop integer versions of the relevant algorithms, scaling up intermediate values to avoid loss of precision, especially now that 64-bit machines are available. In fact, sfArk 3 (never released) did just that - it used improved prediction algorithms (faster and more accurate) and pure integer calculations.

* The sfArk compression algorithm, even when floating-point calculations are used, is lossless - all relevant information needed to regenerate the original data is included in the output .sfArk file. If your decompression program doesn't decode it properly, then that's because of bugs or limitations in your decompression program (or possibly an unreported bug in the original sfArk compression program). If you have a .sfArk file that decompresses properly under sfArk for Windows but not with some third-party decompression program then clearly that third party program is not properly replicating the behaviour of the original sfArk utility. QED.

* sfArk was not "written by a programmer who didn't know what he was doing". Oh wait, I'm not sure about that one :) But seriously, it was just a tool to do a job, one that was state-of-the-art at its time and the result of a lot of original research. Newer developments such as Monkey's Audio and more recent versions of FLAC can produce better results, but only by a small margin. Josh Coalson, who wrote FLAC, said back in 2003 "I'd love to beat sfArk" - http://lists.xiph.org/pipermail/flac-de ... 01463.html . A while ago (March 2011) I asked Josh if he would like to take on support for sfArk decompression in FLAC, but he was too busy.

* A short while ago (and prior to having found this thread here) I posted a comment at https://aur.archlinux.org/packages/sfarkxtc/ asking for anybody interested in maintaining sfArk support under Linux to get in touch. That offer is still very much open (a couple of people have expressed an interest so far.) Anybody here interested, who has both the programming skills AND adequate emotional intelligence (hint) please PM me.
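
As an illustration of the "integer versions of the relevant algorithms, scaling up intermediate values" idea from the list above, here is a minimal sketch of a fixed-point predictor. Everything in it is an assumption made for illustration: the two-tap predictor, the Q14 scaling, the coefficient values and the function names are hypothetical, and none of it is the unreleased sfArk 3 algorithm.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical coefficients scaled by 2^14 (Q14): roughly 1.8 and -0.8. */
#define Q14_ONE    16384
#define COEF1_Q14  29491
#define COEF2_Q14  (-13107)

/* Predict the next sample from the two previous ones using only integers.
   The 64-bit accumulator holds the scaled intermediate value; integer
   division truncates toward zero the same way on every platform. */
static int32_t predict_q14(int32_t prev1, int32_t prev2)
{
    int64_t acc = (int64_t)COEF1_Q14 * prev1 + (int64_t)COEF2_Q14 * prev2;
    return (int32_t)(acc / Q14_ONE);
}

/* Store only the difference between each sample and its prediction. */
static void encode_residuals(const int16_t *in, int32_t *res, size_t n)
{
    int32_t p1 = 0, p2 = 0;
    for (size_t i = 0; i < n; i++) {
        res[i] = in[i] - predict_q14(p1, p2);
        p2 = p1;
        p1 = in[i];
    }
}

/* Rebuild the samples by adding the same prediction back. */
static void decode_residuals(const int32_t *res, int16_t *out, size_t n)
{
    int32_t p1 = 0, p2 = 0;
    for (size_t i = 0; i < n; i++) {
        out[i] = (int16_t)(res[i] + predict_q14(p1, p2));
        p2 = p1;
        p1 = out[i];
    }
}
```

Because no floating-point unit is involved, the encoder and decoder compute identical predictions regardless of architecture or compiler flags.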

raboof
Established Member
Posts: 1702
Joined: Tue Apr 08, 2008 11:58 am
Location: Deventer, NL

Re: New sfArk utility for Linux

Post by raboof »

andy2i wrote:* "rounding errors" are not present "in" the sfArk file. (...) However, the fact that different architectures do floating point maths differently is a complication, meaning that care is needed when coding a compatible decompressor.
What I *think* Jeff is seeing as 'rounding errors' here is that he appears to have taken a binary (possibly compiled with architecture-specific optimizations) and disassembled that. That'd easily produce code that doesn't work accurately on other architectures. Just guessing here though.
andy2i wrote:A short while ago (and prior to having found this thread here) I posted a comment at https://aur.archlinux.org/packages/sfarkxtc/ asking for anybody interested in maintaining sfArk support under Linux to get in touch. That offer is still very much open (a couple of people have expressed an interest so far.)
Cool! I might (though I can't currently invest a serious amount of time, perhaps I can help a bit here or there).

andy2i
Established Member
Posts: 9
Joined: Tue Jan 08, 2013 12:45 pm

Re: New sfArk utility for Linux

Post by andy2i »

raboof wrote: What I *think* Jeff is seeing as 'rounding errors' here is that he appears to have taken a binary (possibly compiled with architecture-specific optimizations) and disassembled that. That'd easily produce code that doesn't work accurately on other architectures. Just guessing here though.
I've already sent you this by PM, but here again for others....

Quickly re. the floating point issue. I *think* all that is needed is to force the compiler to use "double precision" floating point calculations - I had a similar issue when I did the Mac version. The compressor calculates a prediction and then subtracts that from the actual data value, storing only the difference (which is usually a smaller value, so can be output in fewer bits, hence the compression.) The decompressor needs to calculate the same predicted value and add it back to the given difference to obtain the original value. But due to differences in the FP maths (not, strictly speaking, a rounding *error*) that prediction could have a different value than that used by the compressor. Well, you probably know all this :) Using double precision cured the problem on the Mac version, anyway, and as far as I know the old Linux decompressor worked fine, but maybe there is more to it than that - in a nutshell, somehow the decompressor needs to regenerate the same predicted value that was used by the compressor, but that's got to be possible!

... and just to add, I was talking about a Motorola Mac there. So, in the best case you might only need to tweak compiler options to make it work on other systems, and in the worst case add code to emulate what happens on Intel/Windows. But like I said, it's got to be feasible, even if a little messy, to write a reliable decompressor for Linux. Basing it on the original source code should help rather a lot too!
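
The predict-and-store-the-difference scheme described in this post can be sketched roughly as follows. This is only a sketch under stated assumptions: the predictor (a linear extrapolation from the two previous samples, computed in double precision) and the function names are made up for illustration and are not sfArk's actual predictor.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical predictor: extrapolate from the two previous samples. */
static double predict(int32_t prev1, int32_t prev2)
{
    return 1.5 * prev1 - 0.5 * prev2;
}

/* Compressor side: keep only the difference between the actual sample
   and the (truncated) prediction. */
static void encode_diffs(const int16_t *in, int32_t *diff, size_t n)
{
    int32_t p1 = 0, p2 = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t pred = (int32_t)predict(p1, p2);
        diff[i] = in[i] - pred;
        p2 = p1;
        p1 = in[i];
    }
}

/* Decompressor side: recompute the same prediction and add it back. */
static void decode_diffs(const int32_t *diff, int16_t *out, size_t n)
{
    int32_t p1 = 0, p2 = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t pred = (int32_t)predict(p1, p2);
        out[i] = (int16_t)(diff[i] + pred);
        p2 = p1;
        p1 = out[i];
    }
}
```

For a smooth input such as 0, 10, 20, 30, 40 the stored differences settle at 5, which is smaller than the samples themselves. The round trip is exact as long as both sides arrive at the same integer prediction.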

j_e_f_f_g
Established Member
Posts: 1063
Joined: Fri Aug 10, 2012 10:48 pm

Re: New sfArk utility for Linux

Post by j_e_f_f_g »

Use of floating point is not itself the issue (as long as an adequate precision is chosen). The rounding errors (that result from using different floating point units) are due to sfark's repeated conversions between integer and floating point. You are almost assured to get rounding errors in such a scenario. So a programmer should do such conversions only if small rounding errors are irrelevant, which they are not in what is supposed to be a lossless compression algorithm. There's a good reason why, if you look over the source to zlib for example, you won't see such conversions done.

No, simply using 'double' instead of 'float' doesn't solve the issue. The issue isn't precision. The issue is too many int to float conversions.

The sfark format should no longer be used nor supported. It has been made obsolete by more standardized, well-documented formats that have code bases that cleanly compile and run on various architectures.

andy2i
Established Member
Posts: 9
Joined: Tue Jan 08, 2013 12:45 pm

Re: New sfArk utility for Linux

Post by andy2i »

j_e_f_f_g wrote:... a programmer should do such conversions only if small rounding errors are irrelevant, which they are not in what is supposed to be a lossless compression algorithm.
That would be true if the source data were converted between integer and floating point - it is not - only the prediction values. In a nutshell:

Code: Select all

float prediction;
int input, output;
...
prediction = get_predicted_value_as_float();  // Do all kinds of weird stuff here.
output = input - (int) prediction;  // and keep *real* data as integer - this is lossless.
What you seem to be referring to is the integer/float conversions done for the purpose of getting a predicted value. Possibly you have not understood the purpose of those parts of the system. Possibly you have been looking at an unstable/buggy version of the compressor/decompressor - there were all kinds of alphas and betas made available and there may be .sfArk files out on the net that were created by one of them. But the released versions are pretty solid as far as I know. As I indicated above - run a file through the Windows version (it works under Wine too), compress/uncompress and do a binary compare of the original with the regenerated sf2. Lossless. QED (possible bugs excepted).
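
The "binary compare" step of that test can be done with any byte-for-byte comparison. A minimal sketch in C follows; the function name `files_identical` and its return convention are assumptions made for this example.

```c
#include <stdio.h>

/* Returns 1 if both files have identical contents, 0 if they differ,
   and -1 if either file cannot be opened. */
static int files_identical(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = 1;

    if (!a || !b) {
        if (a) fclose(a);
        if (b) fclose(b);
        return -1;
    }
    for (;;) {
        int ca = fgetc(a);
        int cb = fgetc(b);
        if (ca != cb) {      /* also catches one file ending before the other */
            result = 0;
            break;
        }
        if (ca == EOF)       /* both files ended at the same position */
            break;
    }
    fclose(a);
    fclose(b);
    return result;
}
```

On Linux the standard `cmp` command does the same job from the shell.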
j_e_f_f_g wrote:There's a good reason why, if you look over the source to zlib for example, you won't see such conversions done.
Indeed there is a good reason - LZ-style algorithms don't need to do any kind of linear prediction. But that's completely irrelevant in this case.
j_e_f_f_g wrote:The sfark format should no longer be used nor supported.
I don't agree - there is an evident need in the community for a decompress utility. Yes, it would be a smart idea for everybody to decompress their sfArk files one last time, and then recompress them with something else, if required. But they will still need a working sfArk decompress utility to do that.
j_e_f_f_g wrote: It has been made obsolete by more standardized, well-documented formats that have code bases that cleanly compile and run on various architectures.
Agreed, but not the examples you refer to (zlib, gzip, etc). They don't even begin to compete with sfArk in effectiveness. But, as I also indicated in my prior post, modern alternatives such as FLAC (recent versions) and Monkey's Audio are good replacements. Use one of those, tell them that the source is "raw" 16-bit signed audio, and you will get compression ratios similar to sfArk and much better than gzip, 7zip or whatever.

sfArk has also become largely obsolete due simply to the fact that 56k modems and pay-per-minute/byte internet connections are dead and gone (although now we have mobile...)

Jeff, for somebody so apparently concerned with doing things the "right" way, one has to wonder why you never tried to get in contact with me.

j_e_f_f_g
Established Member
Posts: 1063
Joined: Fri Aug 10, 2012 10:48 pm

Re: New sfArk utility for Linux

Post by j_e_f_f_g »

andy2i wrote:What you seem to be referring to is the integer/float conversions done for the purpose of getting a predicted value.
Yes. But since what is stored derives from this "predicted value", then if the decompressor's calculations differ from the compressor's, then you have what is for all practical purposes a lossy decompression because the resulting soundfont will be different from the original.
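
A small sketch of the failure mode described here, under a loudly stated assumption: the only difference between the two sides is how the floating-point prediction is converted to an integer (truncation on one side, round-half-up on the other). That particular difference, the predictor and all names are hypothetical and chosen purely for illustration; nothing in this thread establishes what actually differs between builds of sfArk.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t (*to_int_fn)(double);

/* Two hypothetical float-to-int policies (inputs here are non-negative). */
static int32_t convert_truncate(double x) { return (int32_t)x; }
static int32_t convert_round(double x)    { return (int32_t)(x + 0.5); }

/* Hypothetical predictor: average of the two previous samples. */
static double predict(int32_t prev1, int32_t prev2)
{
    return 0.5 * (prev1 + prev2);
}

static void encode(const int16_t *in, int32_t *diff, size_t n, to_int_fn conv)
{
    int32_t p1 = 0, p2 = 0;
    for (size_t i = 0; i < n; i++) {
        diff[i] = in[i] - conv(predict(p1, p2));
        p2 = p1;
        p1 = in[i];
    }
}

/* Output is int32_t so that a drifting reconstruction cannot overflow. */
static void decode(const int32_t *diff, int32_t *out, size_t n, to_int_fn conv)
{
    int32_t p1 = 0, p2 = 0;
    for (size_t i = 0; i < n; i++) {
        out[i] = diff[i] + conv(predict(p1, p2));
        p2 = p1;
        p1 = out[i];    /* later predictions build on earlier output */
    }
}

/* Count the positions where the decoded output differs from the input. */
static size_t count_mismatches(const int16_t *a, const int32_t *b, size_t n)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            bad++;
    return bad;
}
```

Encoding 10, 11, 13, 16, 20, 25 with `convert_truncate` and decoding with the same function reproduces the input exactly. Decoding with `convert_round` instead yields 10, 11, 14, 17, 22, 27: once the first prediction differs, every later sample is off, because each prediction is built from already-wrong output.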

Here's a simple test. Take your command line decompressor source code, compile it with a recent version of Microsoft's visual studio (should compile easily) which will use intel's SSE instructions for your decompressor. Now try to decompress an sfark made with the old compressor (which does not use SSE instructions). What do you get?
need in the community for a decompress utility.
Yes, but that can be done simply by open sourcing the command line decompressor sfarkxtc. That should be all that's needed.
one has to wonder why you never tried to get in contact with me.
Very simple: at the time I first needed to decompress sfark files (which is around the time this thread started), the url for melodymachine.com returned a 404 error. This occurred for the entire week I tried. Furthermore, I read discussion board messages from others, posted even earlier, that the site had been 404 for a while. And the sole "contact info" was an email address to this inaccessible domain.

andy2i
Established Member
Posts: 9
Joined: Tue Jan 08, 2013 12:45 pm

Re: New sfArk utility for Linux

Post by andy2i »

j_e_f_f_g wrote:But since what is stored derives from this "predicted value", then if the decompressor's calculations differ from the compressor's, then you have what is for all practical purposes a lossy decompression because the resulting soundfont will be different from the original.
What you are describing is simply a flawed decompressor. "Lossless data compression is a class of data compression algorithms that allows the exact original data to be reconstructed from the compressed data" - http://en.wikipedia.org/wiki/Lossless_compression - that doesn't mean that a working decompression utility would be simple to produce. What matters is that it is possible to decompress correctly - even if you had to emulate in software the floating point calculation methods used by the compression routine.
Here's a simple test. Take your command line decompressor source code, compile it with a recent version of Microsoft's visual studio (should compile easily) which will use intel's SSE instructions for your decompressor. Now try to decompress an sfark made with the old compressor (which does not use SSE instructions). What do you get?
What I would get is a flawed decompression utility. Please see above.
Yes, but that can be done simply by open sourcing the command line decompressor sfarkxtc. That should be all that's needed.
Indeed. And nobody here is suggesting otherwise, as far as I can see.
Very simple: at the time I first needed to decompress sfark files (which is around the time this thread started), the url for melodymachine.com returned a 404 error. This occurred for the entire week I tried. Furthermore, I read discussion board messages from others, posted even earlier, that the site had been 404 for a while. And the sole "contact info" was an email address to this inaccessible domain.


By "inaccessible domain" you seem to mean inaccessible web-site. I'm sure you know that SMTP (email) services are independent of HTTP (web) services. And besides, you have WHOIS that would have given you an administrative contact email address (not surprisingly, my email address) which works and was working at that time. As a Linux guy, you must know about the whois command, right?

So, it seems you certainly didn't try very hard to contact me, which, considering your plan was to "reverse engineer" the software, which is specifically prohibited under the original license agreement, does not strike me as the right way to go about things.

For reference: THE LICENSEE agrees not to cause or permit any attempts to determine the source code of the Software, including but not limited to reverse engineering, disassembly or decompilation of the Software, except as permitted by Article 6 of the EU Software Directive, and agrees not to cause or permit any modification, adaptation or reprogramming of the Software.


Right now I'm not too bothered about the legal implications - as I've already indicated, I had already offered the source for inclusion in FLAC almost two years ago. But open source is not, and never was, about blatantly ignoring others' legal and moral rights to ownership.

I'm happy to let you use your own definition of what is and isn't "lossless compression", I simply don't agree with your definition, and I don't think many other people would either. But what does it matter? Can't we get on with something more productive?

You think sfArk files should be eradicated from the web. Have I disagreed? You explained your points in your opening post to this thread. I have explained mine - http://linuxmusicians.com/viewtopic.php ... =15#p35712 and elsewhere here. The issue now is how to most effectively provide a solution to those people who want to be able to download and use sfArk compressed soundfonts. That was my sole objective for searching for discussions on this topic, how I arrived here, and my primary objective in joining this thread, plus setting the record straight. What are your objectives in this context Jeff, and how do they differ from mine?

j_e_f_f_g
Established Member
Posts: 1063
Joined: Fri Aug 10, 2012 10:48 pm

Re: New sfArk utility for Linux

Post by j_e_f_f_g »

andy2i wrote:What you are describing is simply a flawed decompressor.
Take your command line decompressor source code, compile it with a recent version of Microsoft's visual studio which will use intel's SSE instructions for your decompressor. Now try to decompress an sfark made with the old compressor (which does not use SSE instructions). What do you get?
andy2i wrote:What I would get is a flawed decompression utility.
I know. Now I'm just wondering why you think that isn't a design flaw.
simply open source the command line decompressor sfarkxtc. That should be all that's needed.
andy2i wrote:Indeed. And nobody here is suggesting otherwise, as far as I can see.
So then why not do that? You have a domain hosting an executable download. Why not simply post a zip download containing the source to sfarkxtc.c, and a copy of the GPL? What is the point of looking for a linux programmer, and only then, showing this specific person the source? Presuming you do intend to use the GPL for any resulting decompressor, you do realize that the latter action is tantamount to the former?
andy2i wrote:your plan was to "reverse engineer" the software
Only after I tried running the "linux version" of sfark (which I found via a link to an archive named sfarkxtc_lx86.tar.gz containing a single elf executable) and it immediately bombed out with a message saying it wasn't compiled for my architecture (64-bit linux). There was not, and still is not today if you don't count my effort, a working linux solution.
andy2i wrote:The issue now is how to most effectively provide a solution to those people who want to be able to download and use sfArk compressed soundfonts.
There should be no issue (assuming your declarations about wanting it open-sourced, and even included in OSS like flac, are earnest). Post a zip download containing the source to sfarkxtc.c, and a copy of the GPL. That's all you need to do to go open source (assuming your source isn't patent-encumbered).

All this chest-beating about a reverse-engineered utility that actually does exactly what you claim you want to do, inexplicable reluctance to simply open-source the code, expecting programmers to contact you (via whois no less) before they do anything, makes me wonder exactly what you have in mind, and/or if you have a completely different concept of open source?
andy2i wrote:What are your objectives in this context Jeff, and how do they differ from mine?
My objective was to get an sfark decompressor that runs under 64-bit linux. Done.

Can't say how that differs from yours because yours are unclear to me. You claim to want to open-source it, but haven't done so for unknown reasons. I don't know what you're after.

raboof
Established Member
Posts: 1702
Joined: Tue Apr 08, 2008 11:58 am
Location: Deventer, NL

Re: New sfArk utility for Linux

Post by raboof »

j_e_f_f_g wrote:You have a domain hosting an executable download. Why not simply post a zip download containing the source to sfarkxtc.c, and a copy of the GPL? What is the point of looking for a linux programmer, and only then, showing this specific person the source? Presuming you do intend to use the GPL for any resulting decompressor, you do realize that the latter action is tantamount to the former?
I think it's a good idea to make sure releasing the source code is 'done right'. Andy asked for help with that, and some people have stepped up to provide that. If no-one else beats me to it, I'll put up a properly licensed version on github about 2 weeks from now - right now, I'm packing to go skiing first (yay :) ).
There should be no issue (assuming your declarations about wanting it open-sourced, and even included in OSS like flac, are earnest)
I can vouch they're earnest.
Post a zip download containing the source to sfarkxtc.c, and a copy of the GPL. That's all you need to do to go open source
Not if you want to do it 'right' - see, for example, http://www.gnu.org/licenses/gpl-howto.html . It's not particularly hard, but it's some work.
