Richard Stallman Announces GNU C Language Reference Manual


  • #81
    Originally posted by kylew77 View Post

    Pointers in general: they thought it was crazy that we could write array[n] and then *(array + n). I would also say it was hard to teach that variable-size arrays could be allocated with malloc and then grown dynamically. In my personal opinion, we taught what should have been CS 1 and CS 2 as part of CS 1; my small regional university stopped CS 1 at regular arrays, and pointers and the like were saved for CS 2. We didn't learn about data structures until CS 3 / Data Structures. I was teaching these poor kids about linked lists in the last week of class!

    When I was doing my CS master's / PhD, I was doing research in CS education and what the best first language is. It was some fascinating stuff, though I left before I got anything published.
    Ask for your money back from the university - if they taught you C in your first year, it was a C coding class, not a computer science degree.



    • #82
      Originally posted by coder View Post
      Not sure I agree that assembly is necessary for learning "efficient programming". C is good enough, for that.
      I have seen many people who have programmed in C for a long time, who have no clue as to what happens underneath. You can easily write C code that works correctly, without learning how to get efficient results - especially if you ever work with different classes of processor. I have seen people write "x * 0.5" to divide an integer by 2, and then wonder why their program is so slow on an 8-bit microcontroller. Equally, I have seen people who do have some understanding about what is going on underneath, but have no understanding of compilers, write "(x << 2) + x" and claim it is faster than "x * 5". (If that happens to be the fastest way to implement multiply by 5 on the particular target, it's the compiler's job to generate that.)
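      To make those two habits concrete, here's a made-up sketch (nobody's real code):

      #include <stdint.h>

      int32_t half_slow(int32_t x)
      {
          return x * 0.5;     /* promotes to floating point and back -
                                 painfully slow on an 8-bit micro with no FPU */
      }

      int32_t half(int32_t x)
      {
          return x / 2;       /* the compiler picks the cheapest shift/divide */
      }

      int32_t times5_clever(int32_t x)
      {
          return (x << 2) + x;    /* "hand-optimised" - no faster than below, and
                                     left-shifting a negative x is undefined anyway */
      }

      int32_t times5(int32_t x)
      {
          return x * 5;       /* if shift-and-add is fastest on the target,
                                 the compiler generates exactly that */
      }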

      Originally posted by coder View Post
      Yeah, the main thing assembly language programming teaches you is how much even a C compiler does for you. Register allocation is a huge headache if you have to do it by hand. Something with lots of general-purpose registers, like ARM, should be good for dabbling. That way, you shouldn't have to redo register allocation every time you make a little code change.
      A major criticism of ARM is that it doesn't have that many general-purpose registers - only 12 (for 32-bit ARM). That's enough to play with, but can quickly be limiting in real code. Keeping track of what data is in registers, what is in stack slots, and how they move between them is where compilers shine, and human programmers have a lot more difficulty.

      Originally posted by coder View Post
      When I was writing SSE code, I used the C intrinsics, but I'd check the compiler output to make sure it wasn't generating lots of extraneous instructions. I did find a few surprises that way! In the end, I got reasonably close to the efficiency of hand-coded assembly, if not better, and with a lot fewer headaches.
      Using intrinsics can have its advantages and disadvantages. You need to be very careful about how the compiler can re-arrange code - this can lead to better pipelining and scheduling, but can also lead to trouble if the programmer expects the resulting assembly to match the source code directly. Getting such code right, efficient, and suitable for different processors in the same family is a fine art!
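      As a hypothetical illustration of that kind of code (function name and shapes invented):

      #include <stddef.h>
      #include <immintrin.h>

      /* Add two float arrays four elements at a time with SSE intrinsics.
         The compiler still does register allocation and scheduling, so the
         generated assembly need not match this source line-for-line. */
      void add_arrays(float *dst, const float *a, const float *b, size_t n)
      {
          size_t i;
          for (i = 0; i + 4 <= n; i += 4) {
              __m128 va = _mm_loadu_ps(&a[i]);   /* unaligned 4-float load */
              __m128 vb = _mm_loadu_ps(&b[i]);
              _mm_storeu_ps(&dst[i], _mm_add_ps(va, vb));
          }
          for (; i < n; i++)                     /* scalar tail */
              dst[i] = a[i] + b[i];
      }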



      • #83
        Originally posted by kylew77 View Post
        throwing assignment like x = 5;, conditions, basic scanf and printf functions, and preprocessor directives all into week 1 is way too much in my opinion!
        I wouldn't touch the preprocessor, other than to say "Put these #include's at the top of your program. You'll understand why, later."
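        Something like this hypothetical week-1 program, where the #include is simply taken on faith:

        #include <stdio.h>   /* "put these at the top of your program -
                                you'll understand why later" */

        int main(void)
        {
            int x = 5;                 /* week-1 assignment */
            printf("x starts as %d\n", x);
            printf("type a new x: ");
            scanf("%d", &x);           /* week-1 scanf (error checking
                                          comes much later in the course) */
            if (x > 5)                 /* week-1 condition */
                printf("that's bigger!\n");
            return 0;
        }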

        Originally posted by kylew77 View Post
        What haunts me to this day is that we had a woman take the class three times, try her hardest, and not pass each time, and she had to have the class for her engineering degree. She was in mechanical or civil or something like that: someone unlikely to ever need to code in C, but the university insisted that she learn to code in C.
        That's grim. A small part of me does think "gee, if someone has such a hard time grasping these concepts, can they really be such a good engineer?", but I'm sure that's my cognitive bias speaking and I don't know how much stress she was under from other classes or responsibilities. Mathematicians would probably have similar thoughts about some of the areas that give me trouble.

        You'd hope that someone could find additional resources before taking a class a second or third time, but again I don't know her circumstances. If nothing else, there have got to be some good YouTube tutorials about this stuff.

        Originally posted by kylew77 View Post
        This is all ancient history now - 2016 and 2017.
        I know some colleges and universities were quick to jump on the Java bandwagon, way back in the late 90's.



        • #84
          Originally posted by DavidBrown View Post
          Leak a semaphore, mutex or a file handle, or other kinds of resource, and you could be in far more trouble. So when learning programming, learn to take care of your resources, and then memory management is peanuts
          Uh, but the GC languages I know all use objects to manage those other resources, as well. So garbage collection saves you in those areas too, unless you keep an explicit reference to an object longer than you should.



          • #85
            Originally posted by DavidBrown View Post
            I have seen many people who have programmed in C for a long time, who have no clue as to what happens underneath. You can easily write C code that works correctly, without learning how to get efficient results - especially if you ever work with different classes of processor. I have seen people write "x * 0.5" to divide an integer by 2, and then wonder why their program is so slow on an 8-bit microcontroller.
            Well, that's not the fault of using C. You can learn about how CPUs work, and still just use C to program them.

            One of the first tricks I learned to make my MS Quick BASIC programs go faster was to dimension my variables as ints.

            Originally posted by DavidBrown View Post
            Equally, I have seen people who do have some understanding about what is going on underneath, but have no understanding of compilers, write "(x << 2) + x" and claim it is faster than "x * 5". (If that happens to be the fastest way to implement multiply by 5 on the particular target, it's the compiler's job to generate that.)
            Obviously, learning assembly language isn't going to teach you that. What's needed is to learn about optimizing compilers. Or simply compiler optimizations.

            Originally posted by DavidBrown View Post
            A major criticism of ARM is that it doesn't have that many general-purpose registers - only 12 (for 32-bit ARM).
            AArch64 extends that to 31 general-purpose registers, plus a dedicated stack pointer / zero register.

            Originally posted by DavidBrown View Post
            Keeping track of what data is in registers,
            I would generally use macros to map variables to registers. I'd define them at the point of "allocation" (i.e. first use) and undefine them at the point of "deallocation" (i.e. last use).
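            Roughly like this invented fragment (GNU ARM assembly run through the C preprocessor via "gcc -c foo.S"; register choices hypothetical):

            #define count   r4          /* "allocate" count to r4 at first use */
            #define total   r5          /* "allocate" total to r5              */

                    mov     count, #10
                    mov     total, #0
            loop:   add     total, total, count
                    subs    count, count, #1
                    bne     loop

            #undef  count               /* "deallocate" - r4 is free again     */
            #undef  total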

            Originally posted by DavidBrown View Post
            Using intrinsics can have its advantages and disadvantages. You need to be very careful about how the compiler can re-arrange code - this can lead to better pipelining and scheduling, but can also lead to trouble if the programmer expects the resulting assembly to match the source code directly. Getting such code right, efficient, and suitable for different processors in the same family is a fine art!
            I only went as far as making sure the compiler didn't generate significantly more instructions than I expected. Fortunately, I didn't need to worry about whether the code was optimized to within the very last %. Just doing a reasonably efficient vectorization + ensuring decent cache utilization was enough to hit my performance targets.

            Oh, and restricted pointers. Anyone doing code optimization in C or C++ ought to understand them and when to use them. Although C++ doesn't officially support them, every C++ compiler supports them as a nonstandard extension, because they're that important.

            I think the reason the C++ standards committee doesn't like them is that they will break your code, if used improperly. Worse, the breakage is likely dependent on compiler optimization level, which makes usage errors even harder to debug.
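            For illustration, a hypothetical function where restrict earns its keep:

            #include <stddef.h>

            /* restrict promises the compiler that dst and src never alias,
               so the loop can be vectorised freely.  Break the promise by
               passing overlapping pointers and the behaviour is undefined -
               often misbehaving only at higher optimisation levels.
               (In C++ you'd spell it __restrict__ on GCC/Clang.) */
            void scale(float *restrict dst, const float *restrict src,
                       float k, size_t n)
            {
                for (size_t i = 0; i < n; i++)
                    dst[i] = k * src[i];
            }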
            Last edited by coder; 08 September 2022, 10:52 AM.



            • #86
              Originally posted by coder View Post
              Uh, but the GC languages I know all use objects to manage those other resources, as well. So, garbage collection saves you in those areas, unless you keep an explicit reference to an object, longer than you should.
              You absolutely do not want to manage your synchronisation primitives and locks via asynchronous garbage collection! Your PC has lots of memory - it usually doesn't really matter when it gets returned to the free pool. But you want your locks to be taken when you ask for them, in the order you ask for them, and to be released when you ask to release them, in the order you ask (which is almost always the reverse order from acquisition). You don't say "release this lock some time, whenever it suits and you have nothing else to do".

              There might be an object for managing the lock, and the memory for that can be garbage collected. But not the lock.

              So when you use a garbage collected language like Python, you use a "with" statement to control the lock - you don't rely on garbage collection.

              (Note that C++ style RAII is entirely different. There you have precise semantics about the order and time when objects are destructed, and therefore when the lock gets released.)
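              In C terms, that precise discipline looks like this (hypothetical function, error handling omitted):

              #include <pthread.h>

              pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
              pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

              void update_shared_state(void)
              {
                  pthread_mutex_lock(&lock_a);     /* taken exactly when asked     */
                  pthread_mutex_lock(&lock_b);
                  /* ... touch the data both locks protect ... */
                  pthread_mutex_unlock(&lock_b);   /* released exactly when asked, */
                  pthread_mutex_unlock(&lock_a);   /* in reverse acquisition order */
              }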



              • #87
                Originally posted by DavidBrown View Post
                you want your locks to be taken when you ask for them, in the order you ask for them, and to be released when you ask to release them, in the order you ask (which is almost always the reverse order from acquisition). You don't say "release this lock some time, whenever it suits and you have nothing else to do".
                First, that's a little bit of cherry-picking from what you originally said. Your original list was rather open-ended, and now you're just singling out locks. For instance, there are many cases where file handles are merely used to manage a resource, rather than mapping to an actual file that you want to close promptly because someone else might open it. Take, for instance, a poll fd or eventfd.

                Second, yes, you will typically want to make explicit unlock calls to reduce latency. I would still want GC to release a mutex as a fallback, in case I forget.

                Third, unlock order doesn't matter the same way that locking order does. If you lock in the wrong order, a deadlock can occur. However, unlocking in the wrong order won't cause a deadlock, as long as all of the locks are eventually released.
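                A contrived pthreads sketch of the difference:

                #include <pthread.h>

                pthread_mutex_t m1 = PTHREAD_MUTEX_INITIALIZER;
                pthread_mutex_t m2 = PTHREAD_MUTEX_INITIALIZER;

                void thread_1(void)               /* acquires m1, then m2 */
                {
                    pthread_mutex_lock(&m1);
                    pthread_mutex_lock(&m2);      /* may block while holding m1... */
                    /* ... */
                    pthread_mutex_unlock(&m1);    /* "wrong" release order: ugly,  */
                    pthread_mutex_unlock(&m2);    /* but it cannot deadlock        */
                }

                void thread_2(void)               /* acquires m2, then m1: this
                                                     inconsistent order is what
                                                     risks deadlock with thread_1 */
                {
                    pthread_mutex_lock(&m2);
                    pthread_mutex_lock(&m1);
                    /* ... */
                    pthread_mutex_unlock(&m1);
                    pthread_mutex_unlock(&m2);
                }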



                • #88
                  Originally posted by coder View Post
                  First, that's a little bit of cherry-picking from what you originally said. Your original list was rather open-ended, and now you're just singling out locks. For instance, there are many cases where file handles are merely used to manage a resource, rather than mapping to an actual file that you want to close promptly because someone else might open it. Take, for instance, a poll fd or eventfd.

                  Second, yes, you will typically want to make explicit unlock calls to reduce latency. I would still want GC to release a mutex as a fallback, in case I forget.

                  Third, unlock order doesn't matter the same way that locking order does. If you lock in the wrong order, a deadlock can occur. However, unlocking in the wrong order won't cause a deadlock, as long as all of the locks are eventually released.
                  Sure, some resources can certainly be released at some unspecified time in the future. Basically, if the resource is available in quantity, or is not going to be used again while the program is running (as may well be the case for some files), management by garbage collection is fine. If the resource might be contested, or there might be other use for it, then it is not fine.

                  When considering the correctness of a program, you usually assume that garbage collection never actually happens, or happens far in the future, since that's the worst case. If you are using synchronisation mechanisms between threads or processes, that is simply unacceptable. For memory, it is rarely an issue. For something like files, it might well be - it could be the closure of the file handle that leads to the data being committed to the disk, and users might not be happy to press "save" only to find the file is not saved until some indeterminate time in the future.

                  There's no advantage to garbage collection for that kind of thing, and clear and obvious disadvantages (unlike memory, for which garbage collection can be a definite efficiency win as well as being very convenient). That's why with modern garbage collected languages, you generally do not use garbage collection for handling resources other than memory - you use try/finally blocks in Java, "with" statements in Python, "using" statements in C#, etc.

                  If I saw in a code review that someone was releasing a mutex in garbage collection "as a fallback", I'd reject the code. There is no way to test that synchronisation directives are correct in your code - it could all work by coincidence each time you test it, and testing is non-deterministic. You have to get it right in the code, and be absolutely sure that it is correct. Such a "fallback" says you are not sure - so go back and re-write or re-structure the code until you are sure, and until it is obvious from the code that the synchronisations are correct. There are situations where "defensive" programming is good, or where it is useful to minimise the damage that might result from bugs - this is not one of them.

                  (You are, of course, correct that the order of release of locks rarely matters.)



                  • #89
                    Originally posted by DavidBrown View Post
                    There's no advantage to garbage collection for that kind of thing, and clear and obvious disadvantages
                    If you think it's better to have dangling locks and leaked file descriptors (which also likely means unflushed data, as you mentioned) than to safeguard their release via garbage collection, then you're operating under a very different world view than I am. I didn't say there's no benefit to explicit unlocks or file closures -- just that I'd want garbage collection as a backstop.

                    Originally posted by DavidBrown View Post
                    That's why with modern garbage collected languages, you generally do not use garbage collection for handling resources other than memory - you use try/finally blocks in Java, "with" statements in Python, "using" statements in C#, etc.
                    You're failing to distinguish between what the language actually implements and the prescribed best practices. If we take the example of a Python with statement, that simply builds atop the object's existing semantics: it closes the file or releases the lock because that's what the object's cleanup method does. So you're wrong to say that Python doesn't use garbage collection for those things - rather, it provides a better alternative mechanism on top.

                    Originally posted by DavidBrown View Post
                    If I saw in a code review that someone had releasing a mutex in garbage collection "as a fallback", I'd reject the code.
                    The question you need to ask is: what if you didn't catch it? For an arbitrary mutex, would it be better to ship a dangling-lock bug that you might never hit in testing before the software is in the hands of customers? Or would you rather the lock get freed eventually? I know some people prefer the more catastrophic failure mode, but that presumes very good test coverage, which often isn't the case.

                    What about file descriptors, where the program leaks more and more fds, the longer it runs, until operations utilizing file descriptors just randomly start failing? Is that a preferable outcome?

                    I think you're being too idealistic.



                    • #90
                      Originally posted by Developer12 View Post
                      I wouldn't expect him to ever include it, but I still laughed at the exclusion of Rust.
                      On the one hand, it can do any job C can but doesn't allow those mistakes. It even has pointers (of a sort).
                      At the same time, GNU will stick to what it has until the project is dead. They'll never re-implement things.
                      Never mind ideological differences like the propensity towards permissive licences in the Rust community.

