Why software crashes

Programming applications for making music on Linux.

Moderators: MattKingUSA, khz

j_e_f_f_g
Established Member
Posts: 2032
Joined: Fri Aug 10, 2012 10:48 pm
Been thanked: 360 times

Re: Why software crashes

Post by j_e_f_f_g »

j_e_f_f_g wrote:It's a total fallacy programmers have that malloc() won't return 0 due to the OOM.
raboof wrote:Uh, no.
Yes. It's a fallacy that malloc won't return 0. If malloc deduces that there's no way it can fulfill a request (for example, due to memory fragmentation, exceeding a ulimit setting, or other memory-management settings such as overcommit_memory), then malloc will return 0.

http://stackoverflow.com/questions/2248 ... -uses-over
http://voices.canonical.com/jussi.pakka ... -and-linux
http://compgroups.net/comp.unix.program ... ull/471850
raboof wrote:On Linux, due to memory overcommit, malloc() might not return NULL even if there's insufficient memory to back your malloc().
The key word being "might". You do realize that you're tacitly admitting that my above statement is true?
raboof wrote:allocating chunks of memory on a machine that doesn't have them available doesn't always make malloc() return NULL - which j_e_f_f_g above claimed was a 'total fallacy'.
That is not what I wrote. Reread my text which you quoted.
raboof wrote:Therefore I stand by my earlier claim that checking the return value of malloc() for a small number of small allocations is unlikely to improve the stability of your application when running on a typical Linux system.
And I stand by my claim that it should always be done, and that assumptions it's safe/pointless not to do it are incorrect.

Author of BackupBand at https://sourceforge.net/projects/backupband/files/
My fans show their support by mentioning my name in their signature.

User avatar
raboof
Established Member
Posts: 1865
Joined: Tue Apr 08, 2008 11:58 am
Location: Deventer, NL
Has thanked: 52 times
Been thanked: 80 times
Contact:

Re: Why software crashes

Post by raboof »

j_e_f_f_g wrote:
j_e_f_f_g wrote:It's a total fallacy programmers have that malloc() won't return 0 due to the OOM.
raboof wrote:Uh, no.
Yes. It's a fallacy that malloc won't return 0. If malloc deduces that there's no way it can fulfill a request (for example, due to memory fragmentation, exceeding a ulimit setting, or other memory-management settings such as overcommit_memory), then malloc will return 0.
Let's not play word games. I agree there are cases where malloc will return NULL (even in the presence of overcommit). It's just really unlikely, especially when allocating such small chunks of memory on a typical machine (including something like a Raspberry Pi).
j_e_f_f_g wrote:
raboof wrote:On Linux, due to memory overcommit, malloc() might not return NULL even if there's insufficient memory to back your malloc().
The key word being "might".
Indeed, that's why I made it italic. I'm not claiming malloc() will never ever return NULL. I'm claiming that checking for it does not noticeably improve stability in this case, because when it happens (which is in itself highly unlikely), you're already in bigger trouble.
j_e_f_f_g wrote:
raboof wrote:Therefore I stand by my earlier claim that checking the return value of malloc() for a small number of small allocations is unlikely to improve the stability of your application when running on a typical Linux system.
And I stand by my claim that it should always be done, and that assumptions it's safe/pointless not to do it are incorrect.
I think I've given a convincing argument for why it's fairly pointless to check the return value of malloc here from a stability point of view. I won't repeat it.
User avatar
nedko
Established Member
Posts: 13
Joined: Sat Feb 26, 2011 9:14 pm

Re: Why software crashes

Post by nedko »

What can be done to make Linux memory management more reliable and predictable? I.e., to ensure that a Linux audio program won't be killed by oom_kill.c. I guess we could allocate memlocked memory and then use it for dynamic allocations. But how can this or a similar approach for avoiding the OOM killer be combined with the virtual memory mechanisms that operate on CPU pages?
male
Established Member
Posts: 232
Joined: Tue May 22, 2012 5:45 pm

Re: Why software crashes

Post by male »

nedko wrote:What can be done to make Linux memory management more reliable and predictable? I.e., to ensure that a Linux audio program won't be killed by oom_kill.c. I guess we could allocate memlocked memory and then use it for dynamic allocations. But how can this or a similar approach for avoiding the OOM killer be combined with the virtual memory mechanisms that operate on CPU pages?
The first step is to prove that you have a problem in the first place, before attempting to solve it. Basically, if the OOM killer is involved, something is very wrong.

A similar situation comes up if you run out of disk space. There's simply no way to expect e.g. an HDR recorder to deal with that 'gracefully': the disk is full, and if the task was to record audio, the task has become impossible.

What if the process that's leaking memory is the very one you don't want killed, e.g. an audio application such as Ardour? There is no solution other than to fix the leak. You can do all the workarounds you want: check for a NULL return from malloc(), change the OOM killer, flag some processes as unkillable. But whatever you do, you can't change the fact that the system is out of memory and the only way to fix that is by releasing some. Even that probably won't help, because the leak still exists, there's no way to know which process is responsible, and even if you could know, it might be the very process you'd most like to avoid killing.
User avatar
nedko
Established Member
Posts: 13
Joined: Sat Feb 26, 2011 9:14 pm

Re: Why software crashes

Post by nedko »

male wrote: The first step is to prove that you have a problem in the first place, before attempting to solve it. Basically, if the OOM killer is involved, something is very wrong. A similar situation comes up if you run out of disk space. There's simply no way to expect e.g. an HDR recorder to deal with that 'gracefully': the disk is full, and if the task was to record audio, the task has become impossible. What if the process that's leaking memory is the very one you don't want killed, e.g. an audio application such as Ardour? There is no solution other than to fix the leak. You can do all the workarounds you want: check for a NULL return from malloc(), change the OOM killer, flag some processes as unkillable. But whatever you do, you can't change the fact that the system is out of memory and the only way to fix that is by releasing some. Even that probably won't help, because the leak still exists, there's no way to know which process is responsible, and even if you could know, it might be the very process you'd most like to avoid killing.
I've seen a "memory leak" in Firefox kill the X server. Is that strong enough proof for you? I don't want a memory leak in a non-critical application to kill a mission-critical application. Maybe I could just not start Firefox, but that only helps if no other "evil" background process starts leaking. And most importantly, if I really wanted almost all the code running on the CPU to be mine, I could use something like DOS. But then again, my question was about what can be done to improve the situation on Linux, and your answer doesn't provide any useful hint; it just tells me to go fix a random leaking app. Not useful at all, eh? :D
tramp
Established Member
Posts: 2434
Joined: Mon Jul 01, 2013 8:13 am
Has thanked: 11 times
Been thanked: 556 times

Re: Why software crashes

Post by tramp »

nedko wrote: But then again, my question was about what can be done to improve the situation on Linux, and your answer doesn't provide any useful hint; it just tells me to go fix a random leaking app. Not useful at all, eh? :D
You can use mmap() and mlock() to ensure your critical data stays in memory, but that won't help when an "evil" process starts leaking. At the very least your X server will block and you'll end up with an unusable system.
Indeed, the only way to improve the situation for all Linux users is to help fix the leaking process, or to remove it from distributions.
The only alternative is to run an embedded device where you control the available processes.
Fortunately, leaking processes are rare these days. :)
On the road again.
tux99
Established Member
Posts: 346
Joined: Fri Sep 28, 2012 10:42 am
Contact:

Re: Why software crashes

Post by tux99 »

tramp wrote:Fortunately, leaking processes are rare these days. :)
I don't think they're rarer than before; it's just that we have so much RAM these days that it's hard to run out of it, so we don't notice memory leaks as much as we used to.
tramp
Established Member
Posts: 2434
Joined: Mon Jul 01, 2013 8:13 am
Has thanked: 11 times
Been thanked: 556 times

Re: Why software crashes

Post by tramp »

tux99 wrote:
tramp wrote:Fortunately, leaking processes are rare these days. :)
I don't think they're rarer than before; it's just that we have so much RAM these days that it's hard to run out of it, so we don't notice memory leaks as much as we used to.
You'll notice it when you keep an eye on your memory usage (which you do constantly when developing applications). In my experience they are rare and, when they do exist, usually quickly fixed.
On the road again.
User avatar
nedko
Established Member
Posts: 13
Joined: Sat Feb 26, 2011 9:14 pm

Re: Why software crashes

Post by nedko »

tramp wrote:You can use mmap() and mlock() to ensure your critical data stays in memory, but that won't help when an "evil" process starts leaking. At the very least your X server will block and you'll end up with an unusable system.
Indeed, the only way to improve the situation for all Linux users is to help fix the leaking process, or to remove it from distributions.
The only alternative is to run an embedded device where you control the available processes.
Fortunately, leaking processes are rare these days. :)
mlock() and mmap() don't help. I did some tests and the kernel kills my process anyway. Looking at the kernel code, it seemed that it should be possible with mlock(), but it doesn't.

Without modifying kernel code, I've so far come up with two approaches:
1. Try to estimate whether the big allocation that has to be locked in memory will succeed, by inspecting current system memory usage.
2. Allocate and lock the memory in a separate process and, if that process doesn't get killed, map it into the address space of the process that actually needs it.

The problem of random apps eating memory looks solvable on newer kernels by using cgroups and the memory resource controller.
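A sketch of the cgroup idea, as a config fragment (cgroup-v1 memory controller; the group name and limit are hypothetical, and writing to cgroupfs needs root):

```shell
# Assumes the memory controller is mounted at /sys/fs/cgroup/memory.
mkdir /sys/fs/cgroup/memory/browsers
echo $((512 * 1024 * 1024)) > /sys/fs/cgroup/memory/browsers/memory.limit_in_bytes
echo $FIREFOX_PID > /sys/fs/cgroup/memory/browsers/tasks
# A leak in this group now hits its own 512 MB limit (and its own
# per-group OOM handling) instead of exhausting system memory.
```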
tramp
Established Member
Posts: 2434
Joined: Mon Jul 01, 2013 8:13 am
Has thanked: 11 times
Been thanked: 556 times

Re: Why software crashes

Post by tramp »

I've avoided enabling cgroups in my kernels so far, but yes, from reading the documentation it looks like it could be a solution to this particular issue.
Please let us know if you run some tests with it.
On the road again.
Post Reply