LibreOffice 25.2 Alpha 1 Open-Source Office Suite Released

  • Old Grouch
    Senior Member
    • Apr 2020
    • 669

    #81
    Originally posted by oiaohm View Post

    Depends on the version of the standard. The early version of ISO/IEC 26300 defines ZIP via an Info-ZIP note from 199x, which was 32-bit ZIP only, as in a 4 GiB limit. The newer version of the standard updated to the 2004 PKWARE definition of ZIP, which includes ZIP64 and takes this to effectively 16 EiB. This was one of those unintentional changes: the idea was simply to update the document to use the official newer ZIP format to get the security features, and ZIP64 came in as a by-product.

    This is why it was 2023 before LibreOffice got ZIP64 support.

    The next killer is that the OpenDocument standard does not define the maximum number of rows and columns in a spreadsheet.


    In Excel 2010, the maximum worksheet size is 1,048,576 rows by 16,384 columns. In this article, find all workbook, worksheet, and feature specifications and limits.

    Yes, applications like Calc and Excel have limits on the maximum number of rows and columns, and so do all spreadsheet programs. But these limits mean your application is technically non-conforming at the outer edges of the standard.

    A lot of data tables for statistical processing go over 1,048,576 rows.

    The R import and export options for ODS really go for the maximum that ISO/IEC 26300 allows: 10 to 20 times the number of rows Calc or Excel will accept, no problem.
    The fact that ODF v1 defined ZIP via Info-ZIP Application Note 970311, 1997, I regard as somewhat embarrassing. But given that the group in OASIS defining ODF were document format experts, and not data compression experts, I guess it was simply missed until it was an unreasonable amount of effort to change or remedy.
    I think it is fine that the maximum number of rows or columns has no artificial limit: an otherwise conforming application can point out what its maximum limits are, and fail with an appropriate and informative error message if presented with a file that exceeds the application limits.
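
As a minimal sketch of that idea in Python: an application checks a sheet's dimensions against its own stated limits and fails with an informative message. The names `check_sheet_size` and `SheetTooLargeError` are hypothetical, and the default limits are simply the Excel 2010 figures quoted above; a real application would substitute its own.

```python
# Assumed limits: the Excel 2010 grid size quoted earlier in the thread.
MAX_ROWS = 1_048_576
MAX_COLS = 16_384


class SheetTooLargeError(ValueError):
    """Hypothetical error raised when a sheet exceeds the application's limits."""


def check_sheet_size(rows: int, cols: int,
                     max_rows: int = MAX_ROWS, max_cols: int = MAX_COLS) -> None:
    """Raise SheetTooLargeError naming every limit the sheet exceeds."""
    problems = []
    if rows > max_rows:
        problems.append(f"{rows:,} rows (limit {max_rows:,})")
    if cols > max_cols:
        problems.append(f"{cols:,} columns (limit {max_cols:,})")
    if problems:
        raise SheetTooLargeError(
            "Sheet exceeds application limits: " + "; ".join(problems)
        )


# A sheet exactly at the limit passes silently.
check_sheet_size(1_048_576, 16_384)
```

The file itself can still be perfectly valid ODF; the error only reports what this particular application cannot handle.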

    I think it would be reasonable for a future revision of the ODF standard to include a newer compression method (or even several), but retain backwards compatibility by allowing no compression, or ZIP compression.

    It 'would be nice' if any future compression standard used by ODF were an ISO standard and FLOSS. I'm not sure if this has yet been achieved: ZIP file format, Standardization


    • oiaohm
      Senior Member
      • Mar 2017
      • 8257

      #82
      Originally posted by Old Grouch View Post
      The fact that ODF v1 defined ZIP via Info-ZIP Application Note 970311, 1997, I regard as somewhat embarrassing.

      Open Document Format for Office Applications (OpenDocument) Version 1.3 OASIS Standard 27 April 2021

      [ZIP] PKWARE Inc. Zip APPNOTE Version 6.2.0,
      https://www.pkware.com/support/appli...-note-archives, 2004
      This is the same in 1.4 as well.

      There is no clean link for it as there is with 1.3, but the change from Info-ZIP to the PKWARE 2004 note happened in 1.2.

      Old Grouch, OpenDocument/ODF version numbers have two parts. 1.0 and 1.1 use the Info-ZIP 1997 note, which is the original 32-bit-only ZIP. 1.2 and newer use the PKWARE definition, which contains ZIP64. Yes, this slipped under the radar and got into ODF in 2011.
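
For a sense of scale, both limits follow from the width of the size fields: 32 bits in the original ZIP format, 64 bits with ZIP64. A quick Python sketch of the arithmetic (the constant names and the `human` helper are only illustrative, not part of any standard):

```python
# Classic ZIP stores sizes and offsets in 32-bit fields; ZIP64 widens them to 64 bits.
ZIP32_MAX_BYTES = 2**32 - 1
ZIP64_MAX_BYTES = 2**64 - 1


def human(n: int) -> str:
    """Format a byte count using binary (1024-based) units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    value = float(n)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024
    return f"{value:.1f} {units[-1]}"  # not reached; keeps the return type explicit


print(human(ZIP32_MAX_BYTES))  # 4.0 GiB
print(human(ZIP64_MAX_BYTES))  # 16.0 EiB
```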

      Originally posted by Old Grouch View Post
      I think it would be reasonable for a future revision of the ODF standard to include a newer compression method (or even several), but retain backwards compatibility by allowing no compression, or ZIP compression.
      I am fine with that. The issue is that the change got in without being noticed.

      Originally posted by Old Grouch View Post
      It 'would be nice' if any future compression standard used by ODF were an ISO standard and FLOSS. I'm not sure if this has yet been achieved: ZIP file format, Standardization

      When you open up that standard, you notice it amounts to "hey, let's just use a newer PKWARE ZIP format version", namely 6.3.3, and then it proceeds to forbid things like splitting ZIP archives across multiple files.

      Yes, the problem with ISO/IEC 21320-1:2015 is that it effectively creates a sub-format of the PKWARE ZIP format that is not compatible with normal ZIP tools.

      I would like to see an ISO version of ZIP, but ISO/IEC 21320-1:2015 is absolutely not it.

      Originally posted by Old Grouch View Post
      I think it is fine that the maximum number of rows or columns has no artificial limit: an otherwise conforming application can point out what its maximum limits are, and fail with an appropriate and informative error message if presented with a file that exceeds the application limits.
      The point I was making is that you do see ODS documents with more rows and columns than any spreadsheet program can open. Yes, you see people doing data analysis passing around spreadsheet files, but very commonly they are files no spreadsheet program can open.

      The R language add-on xlsx generator does not check the row or column limit either, and will very happily read/write an xlsx document larger than Excel or any other spreadsheet program can open.

      This is one of the problems: people see ODS or XLSX files being passed between people doing data analysis and presume those people are using Excel, LibreOffice or some other spreadsheet program. Then you notice how often the files being passed around are bigger than those programs can handle, so they must be coming out of something other than a spreadsheet program, because today all spreadsheet programs have the same upper limits.
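
One way to spot such a file without opening it in a spreadsheet program is to stream the worksheet XML inside the XLSX archive and look at the highest row index. This is only a sketch using the Python standard library: the function names are made up, the member path `xl/worksheets/sheet1.xml` is an assumption (the usual location of the first sheet, not guaranteed), and the row limit is the Excel figure quoted earlier in the thread.

```python
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet parts in XLSX files.
SHEET_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
EXCEL_MAX_ROWS = 1_048_576  # Excel row limit quoted earlier in the thread


def highest_row_index(xlsx, member="xl/worksheets/sheet1.xml"):
    """Return the highest row index found in one worksheet, parsed as a stream."""
    highest = 0
    row_tag = f"{{{SHEET_NS}}}row"
    with zipfile.ZipFile(xlsx) as archive:
        with archive.open(member) as stream:
            for _event, elem in ET.iterparse(stream, events=("end",)):
                if elem.tag == row_tag:
                    r = elem.get("r")
                    # Rows without an explicit index are assumed to follow the previous one.
                    highest = max(highest, int(r)) if r else highest + 1
                    elem.clear()  # drop the row's cells to keep memory use low
    return highest


def fits_in_excel(xlsx, member="xl/worksheets/sheet1.xml"):
    """True if the worksheet's highest row index is within the Excel row limit."""
    return highest_row_index(xlsx, member) <= EXCEL_MAX_ROWS
```

A column check would work the same way on the cell references; it is left out here to keep the sketch short.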

      R and Python have replaced Excel for a lot of data analysis, and this can be seen in the spreadsheet files being passed around that are too large for a spreadsheet program to process. Yes, it is malicious compliance: the boss has told the data analysis people to use a spreadsheet output format, but they don't want the issues that using a spreadsheet program brings, which are part of the design of spreadsheets themselves.

      Yes, a lot of the reasons given for why you must have a spreadsheet program are false when you look at industry and find that more and more of those use cases are handled by data analysis in R or Python, and this can be seen in the generated files.

      The data analysis used to monitor LibreOffice development is done in R, not Calc. It gets better: Microsoft developers have also said the data analysis on MS Office development is done in R as well.

      The big thing that data analysis needs from a spreadsheet program going forward is the ability to read legacy formats to get at old data, and the only spreadsheets for that are the ones based on the LibreOffice code base.

      Yes, people spend all this time learning how to use Excel for data analysis to get into the workforce, only to be told these days that they are forbidden from using it and must use R or Python, because Excel is not trusted due to the issues with spreadsheets, the mistakes that happen when humans use spreadsheets, and Excel's designed-in defects that cause it to miscalculate.


      • Old Grouch
        Senior Member
        • Apr 2020
        • 669

        #83
        Originally posted by oiaohm View Post

        I am fine with that. The issue is that the change got in without being noticed.
        I agree that the change getting in without being noticed is poor.


        Originally posted by oiaohm View Post
        I would like to see an ISO version of ZIP, but ISO/IEC 21320-1:2015 is absolutely not it.
        I'd like to see an ISO (FLOSS) data-at-rest compressor/decompressor. I also am not convinced that ISO IEC 21320-1 2015 is the best option for this.


        Originally posted by oiaohm View Post
        Yes, people spend all this time learning how to use Excel for data analysis to get into the workforce, only to be told these days that they are forbidden from using it and must use R or Python, because Excel is not trusted due to the issues with spreadsheets, the mistakes that happen when humans use spreadsheets, and Excel's designed-in defects that cause it to miscalculate.
        Excel is a terrible tool for data analysis, but it is just good enough and cheap enough to make naïve users not want to buy* and learn how to use something more appropriate. When Microsoft's Office suite is the standard installation for businesses, it is very hard to justify a separate tool when Excel is regarded as 'good enough' by people who ought to know better, but don't.

        *Or fight with a corporate IT department to install FLOSS software.


        • oiaohm
          Senior Member
          • Mar 2017
          • 8257

          #84
          Originally posted by Old Grouch View Post
          Excel is a terrible tool for data analysis, but it is just good enough and cheap enough to make naïve users not want to buy* and learn how to use something more appropriate. When Microsoft's Office suite is the standard installation for businesses, it is very hard to justify a separate tool when Excel is regarded as 'good enough' by people who ought to know better, but don't.

          *Or fight with a corporate IT department to install FLOSS software.
          https://posit.co/products/open-source/rstudio/ RStudio is available as a no-charge FLOSS edition, but it has a few limitations; beyond that, the commercial pro edition is over 1000 USD per install per year.

          R and Python with all the features are absolutely not cheap. IT departments not wanting to install the FLOSS versions where they suit end up making this cost way, way worse. Yes, RStudio allows having some of your users on the FLOSS edition and some on the pro edition.

          Parties like posit.co can only charge this much for these tools because of how much mistakes made with Excel/spreadsheets in fact cost. A lot of training places still teach only Excel for data analysis, even though it is well documented that spreadsheets are not up to the job.
