Linus Torvalds Injects Tabs To Thwart Kconfig Parsers Not Correctly Handling Them


  • ddriver
    replied
    Originally posted by Old Grouch

    Which means that sharing a config file with someone for comments or removing configuration mishaps isn't generally possible: and if your object model changes, the config could change as well, so being able to interpret the binary config correctly could well be software version dependent. It also makes independent detection and possible correction of corruption rather difficult.

    If you have come up with a better way, then perhaps it should become standard practice. Standard practices do need testing every now and then to see if they are still relevant or can be improved, and perhaps your way is the way forward. I will be surprised if it is generally adopted, but life is full of surprises.
    It is a model-first approach through and through. This includes the full system state - to create a new application, to run an existing application, to make modifications to something pre-existing, if authorized to.

    All file operations go through that as well. In fact, under my usage model, the user is not expected or encouraged to manually tamper with files. Doing a dumb file copy from one system to another is not supported - the file would not work on another installation; the proper way to initiate a file transfer is through the object model. You can still make a configuration snapshot and use the proper channel to transfer it to another system, which will transfer it in the target system's native format. It will be recognized for what it is on the other end, and any incompatibilities or breaking changes will be immediately highlighted when the configuration is received and resolved, before being applied.

    The architecture does remove a lot of legacy manual labor stuff, and that includes working with files the olden way. All that stuff is completely abstracted away, and can be implemented in a number of different back-ends, not necessarily a file storage device.

    Last but not least - having functionality where a critical section can be modified by means of parsing some arbitrary file anyone could have left on the system in any arbitrary state is a big no-no in my book. Under my usage model, unauthorized modification should not occur under any circumstances.
    Last edited by ddriver; 20 April 2024, 12:04 PM.



  • anarki2
    replied
    Originally posted by curfew
    This time he's wrong. Tabs have no purpose in anything. It's an ancient gimmick aimed at saving meaningless amounts of storage space and computing resources by inventing a silly character that can substitute multiple somewhat redundant characters, while also including annoying side-effects, and sometimes even bugs as evident. Those days are gone, have been for 30 years already.
    I hope you don't work on frontend. You confuse markup with content. This is the equivalent of putting multiple line breaks in the HTML text, instead of adjusting paragraph padding via CSS.

    Which one do you think makes more sense? Did you really want to have multiple empty paragraphs, or were you just too lazy to use the appropriate tool for the task?

    If you don't understand the difference, that's fine, just stop judging Linus for that.



  • Old Grouch
    replied
    It's worth remembering what a horizontal tab character actually means.

    It is an instruction to the device displaying the data in which the horizontal tab character is found to move the virtual print head to the next tab-stop after the current position of the virtual print head on the displaying device. The virtual print head does not backtrack to a previously 'unused' tab-stop.

    This has several consequences:

    1) It means that the tab stop settings are not necessarily described in the data itself. Although they can be in, for example, word-processing documents, they are usually extrinsic to the data, and a setting of the display device. It is a control character instructing the displaying device to do something. Different devices can have different defaults.

    2) It is not a character that instructs its own replacement by a (variable, possibly calculated) number of space characters, although that is one implementation.

    3) Tab stops also work when using proportional fonts, because they are not defined by character widths. They are 'simply' a distance from the start point of the (virtual) print head. In monospaced fonts, that can correspond with a multiple of a character width, but it need not necessarily do so. That distance can be measured in points, pica, inches, centimetres, furlongs, light-years, and also multiples of the (current) character width in the (current) font.

    4) If no tab stops are set, an instruction to move to the next one is difficult to interpret. In a mechanical typewriter, the carriage would move to the extreme left. This is a consequence of the mechanical implementation. Other possible behaviours are to ignore the horizontal tab character (i.e. it is zero width), or replace it with a single space character.
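The four points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `next_tab_stop` and `render_width` are hypothetical, the stops are passed in from outside the text (point 1), positions are plain distances rather than character counts (point 3), and a tab with no stop ahead of it is treated as zero width, which is just one of the behaviours listed in point 4.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def next_tab_stop(position: float, stops: Sequence[float]) -> Optional[float]:
    """Return the first tab stop strictly beyond `position`, or None if none exists.

    `stops` must be sorted ascending; the unit is arbitrary (points, cm, cells).
    The head never backtracks: a stop at or before `position` is never chosen.
    """
    index = bisect_right(stops, position)
    return stops[index] if index < len(stops) else None


def render_width(text: str, stops: Sequence[float], char_width: float = 1.0) -> float:
    """Move a virtual print head across `text` and return its final position.

    Assumption: with no stop ahead, the tab is ignored (zero width).
    """
    position = 0.0
    for ch in text:
        if ch == "\t":
            target = next_tab_stop(position, stops)
            if target is not None:
                position = target  # jump to the stop; no spaces are inserted
        else:
            position += char_width
    return position
```

Calling `render_width("ab\tc", [8, 16])` moves the head to the stop at 8 before placing the final character, while the same text with an empty list of stops simply skips the tab.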

    This Stack Overflow answer covers much the same ground: https://stackoverflow.com/questions/...of-systems-and

    I have no strong opinions on the use of horizontal tabs or spaces in formatting code.

    Vertical tab stops and the vertical tab character operate in the same way. The vertical tab stops are a distance from the top of the document page. A combination of horizontal tabs, carriage returns, and vertical tabs allows rapid movement around the print area which is useful when entering information onto pre-printed forms using impact printers. You get to the start of the next form using the form feed character, which could have an implied carriage return.

    This fast navigation to fields in a form is why the horizontal tab key is used to step through from field to field in on-screen forms.



  • freerunner
    replied
    Originally posted by ddriver
    Having the configuration defined as a model allows you to view and edit it in human-readable form while in binary format. And you don't even need a text editor, just run the application in configuration mode. Plus the type safety and input validation and whatnot. And it actually shows you the configuration tree, so you don't have to know it by heart or look it up elsewhere.

    How is dumb old text better?
    Let's just wait for systemd integration. /s
    (How I despise a binary syslog)



  • Old Grouch
    replied
    Originally posted by ddriver
    But when I say "binary configuration" I mean a binary portable format, generated in and from a portable object model definition.

    In my architecture, I use a dynamic object model which in turn can generate context specific code from definitions in the object model format. The code is generated from the definitions, so it is intrinsically compatible with them, and any change to the definitions either dynamically resolves or returns the error before the code ever runs into it. Since the full system model is always available, you can detect logical errors or breaking changes even between multiple applications or libraries via real time static analysis of the entire system state before you ever run into them during execution.

    So it is quite advanced stuff actually, it can be expected to properly handle the binary format it defines, shocking as that might be!
    Thank you for the extensive reply. Upvoted.

    If I understand correctly, you need to be running your software, or at least something that can interpret the object model in order to correctly interpret the binary configuration.

    Which means that sharing a config file with someone for comments or removing configuration mishaps isn't generally possible: and if your object model changes, the config could change as well, so being able to interpret the binary config correctly could well be software version dependent. It also makes independent detection and possible correction of corruption rather difficult.
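To make the version-dependence and corruption points concrete, here is a minimal sketch of a binary config container. It is not ddriver's format, which is not described in this thread; the `CFG1` signature, the header layout and the function names are all assumptions made up for illustration. It shows that a reader can reject a blob whose schema version it does not know and can detect corruption via a checksum, but that it cannot say what the payload means, or repair it, without the matching schema.

```python
import struct
import zlib

MAGIC = b"CFG1"                   # hypothetical file signature
HEADER = struct.Struct("<4sHI")   # signature, schema version, CRC32 of payload


def pack_config(version: int, payload: bytes) -> bytes:
    """Prefix an opaque payload with signature, schema version and checksum."""
    return HEADER.pack(MAGIC, version, zlib.crc32(payload)) + payload


def unpack_config(blob: bytes, supported_version: int) -> bytes:
    """Return the payload, or raise ValueError if the blob cannot be trusted."""
    if len(blob) < HEADER.size:
        raise ValueError("truncated header")
    magic, version, checksum = HEADER.unpack_from(blob)
    payload = blob[HEADER.size:]
    if magic != MAGIC:
        raise ValueError("not a config blob")
    if version != supported_version:
        # The reader has no way to interpret a schema it was not built for.
        raise ValueError(f"schema version {version} not supported")
    if zlib.crc32(payload) != checksum:
        # Corruption is detected, but not located or corrected.
        raise ValueError("checksum mismatch: payload is corrupted")
    return payload
```

A text file damaged in the same way could at least be opened and eyeballed; here the only outcome available to a third party is the `ValueError`.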

    If you have come up with a better way, then perhaps it should become standard practice. Standard practices do need testing every now and then to see if they are still relevant or can be improved, and perhaps your way is the way forward. I will be surprised if it is generally adopted, but life is full of surprises.

    Thank you again for your comments. It's always interesting to see another point of view, and every so often I learn something and change my mind - quite probably something I don't do enough of, but then again, I am an Old Grouch.



  • ddriver
    replied
    Originally posted by Old Grouch

    I'm afraid we shall have to agree to disagree. I respect your position, but I would regard binary configuration files as 'an intrinsic property of the software architecture' to be a warning signal that I've got something wrong. One of the reasons for making things human-readable (and writeable) is the improved ease of recovering from unexpected errors. I certainly agree that writing robust parsers is non-trivial.

    Binary files tend to be non-portable, and difficult to analyse if you don't have a schema defining their structure, and unless you have taken extraordinary measures, are unlikely to be self-documenting.

    I certainly don't expect to convince you to change your practice: but at least I hope that you acknowledge that there are other ways of doing things that other people might prefer, even if for reasons that you personally believe to be invalid or irrelevant.

    May all your programs be bug free!

    You continue to misunderstand. When you hear "binary configuration" you think of the clumsy problematic legacy way that would be handled, and its resulting downsides of incompatibility and poor manual maintenance. And I actually agree - nobody wants that!

    But when I say "binary configuration" I mean a binary portable format, generated in and from a portable object model definition.

    In legacy "coding", you type text, to parse and ultimately convert it to a static object model for further code generation. Then a static binary blob with nearly zero introspection is generated, where the cpu executes code in the blind, with the presumption there's no undefined behavior.

    In my architecture, I use a dynamic object model which in turn can generate context specific code from definitions in the object model format. The code is generated from the definitions, so it is intrinsically compatible with them, and any change to the definitions either dynamically resolves or returns the error before the code ever runs into it. Since the full system model is always available, you can detect logical errors or breaking changes even between multiple applications or libraries via real time static analysis of the entire system state before you ever run into them during execution.

    So it is quite advanced stuff actually, it can be expected to properly handle the binary format it defines, shocking as that might be!

    However, it is still damn nice of you that you regret that you won't sway me from my erroneous ways. I would have so much liked to be convinced to go back to the mess that got me motivated enough to unlearn the tedious, time-wasting and creativity-restricting standard practices and come up with something a tad more useful.



  • Old Grouch
    replied
    Originally posted by ddriver

    It is not a matter of some insane micro optimization, it is an intrinsic property of the software architecture. I didn't go out of my way to write a redundantly efficient configuration system, it just works "as is" on top of the object and memory models that I am using across all my software.

    One of the benefits is I will definitely not find myself in a situation where I have to inject tabs into text to avoid critical failures. That's quite ridiculous to be honest. A dumb solution to a dumb problem that should have long been rectified.

    Today you can afford to parse text for no good reason, tomorrow you can afford to inject tabs while at it. Where does it end? You can't keep patching bad software design with more bloat or waste hardware resources on it - that only makes it worse over time.
    I'm afraid we shall have to agree to disagree. I respect your position, but I would regard binary configuration files as 'an intrinsic property of the software architecture' to be a warning signal that I've got something wrong. One of the reasons for making things human-readable (and writeable) is the improved ease of recovering from unexpected errors. I certainly agree that writing robust parsers is non-trivial.

    Binary files tend to be non-portable, and difficult to analyse if you don't have a schema defining their structure, and unless you have taken extraordinary measures, are unlikely to be self-documenting.

    I certainly don't expect to convince you to change your practice: but at least I hope that you acknowledge that there are other ways of doing things that other people might prefer, even if for reasons that you personally believe to be invalid or irrelevant.

    May all your programs be bug free!



  • c117152
    replied
    Originally posted by curfew
    That sounds like a highly specialized concept with very strict restrictions on usage then.
    The tab stop was originally designated for tabulation and indentation, and that is exactly what it does with elastic tab stops. Early typewriter models had adjustable-width tabs (https://www.instructables.com/Making...th-Typewriter/) and later models had tabulation modes that applied an algorithm for automated text alignment and such: https://shorthandtypist.wordpress.co...er-tabulators/

    Elastic tabstop is simply the next logical step with respect to the differences between typewriters and text editors. But regardless, using tabs for indentation and having adjustable tabstop width in one's editor is simply using tabs for what they're meant for in their intended, albeit adjusted for computers rather than typewriters, use case.
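As a rough idea of what elastic tabstops do, here is a simplified sketch. The function name `align_block` and the `padding` parameter are made up for this example, and it treats the whole input as a single block; real elastic tabstop implementations work out widths per run of adjacent lines that share a column, which this does not attempt.

```python
from typing import List


def align_block(lines: List[str], padding: int = 2) -> List[str]:
    """Pad tab-terminated cells so each column is as wide as its widest cell.

    Simplification: all lines form one block. The text after the last tab on
    a line is not a column cell and is left unpadded.
    """
    rows = [line.split("\t") for line in lines]

    # First pass: find the width each column needs.
    widths: List[int] = []
    for row in rows:
        for i, cell in enumerate(row[:-1]):
            if i == len(widths):
                widths.append(0)
            widths[i] = max(widths[i], len(cell) + padding)

    # Second pass: render every tab as "pad up to the column width".
    aligned = []
    for row in rows:
        cells = [cell.ljust(widths[i]) for i, cell in enumerate(row[:-1])]
        aligned.append("".join(cells) + row[-1])
    return aligned
```

Given `["int\tx;\t// a", "unsigned long\tcount;\t// b"]`, both declarations and both comments end up starting in the same columns, without either line containing a single hand-counted space.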



  • rogerx
    replied
    Originally posted by curfew
    This time he's wrong. Tabs have no purpose in anything. It's an ancient gimmick aimed at saving meaningless amounts of storage space and computing resources by inventing a silly character that can substitute multiple somewhat redundant characters, while also including annoying side-effects, and sometimes even bugs as evident. Those days are gone, have been for 30 years already.
    Why I enjoy Assembly programming.



  • ddriver
    replied
    Originally posted by Old Grouch
    If the rest of your application is so good that the best thing you can do is optimize the config files, then I'm in awe.
    It is not a matter of some insane micro optimization, it is an intrinsic property of the software architecture. I didn't go out of my way to write a redundantly efficient configuration system, it just works "as is" on top of the object and memory models that I am using across all my software.

    One of the benefits is I will definitely not find myself in a situation where I have to inject tabs into text to avoid critical failures. That's quite ridiculous to be honest. A dumb solution to a dumb problem that should have long been rectified.

    Today you can afford to parse text for no good reason, tomorrow you can afford to inject tabs while at it. Where does it end? You can't keep patching bad software design with more bloat or waste hardware resources on it - that only makes it worse over time.
    Last edited by ddriver; 17 April 2024, 12:11 PM.
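For readers wondering what kind of parser failure the thread title refers to, the following sketch may help. It is purely hypothetical: it is not the code of any real Kconfig tool, and `EXAMPLE_OPTION` is a made-up symbol. It only assumes a line in which a tab rather than a space separates the `config` keyword from the symbol name, and contrasts a parser that hard-codes a single space with one that accepts any whitespace.

```python
from typing import Optional


def parse_config_symbol_naive(line: str) -> Optional[str]:
    """Fragile: assumes exactly one space between the keyword and the symbol."""
    if line.startswith("config "):
        return line[len("config "):].strip()
    return None


def parse_config_symbol(line: str) -> Optional[str]:
    """More tolerant: str.split() with no arguments splits on any whitespace run."""
    parts = line.split()
    if len(parts) == 2 and parts[0] == "config":
        return parts[1]
    return None
```

Both functions return `"EXAMPLE_OPTION"` for `"config EXAMPLE_OPTION"`, but only the second one still recognises `"config\tEXAMPLE_OPTION"`; the naive version silently returns `None`.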

