Mono 2.11.3 Packs In Microsoft's Entity Framework


  • #46
    I hope you realize that the reason to use a DBMS is not just to have a way to store the information. There is much more to it, even for small use cases, such as:
    - transactions
    - locking
    - ACID
    - security
    - indexing
    - caching
    - ....

    And there are many types of DBMS with different feature sets. Depending on the use case, you may find different DBMSs useful for the same application. If the application uses an ORM, the underlying DBMS can easily be replaced without touching the application itself.

    So yeah... it turns out (as always) that the world is not black and white and that there is no way to say in general that an ORM is bad/good.



    • #47
      Originally posted by droste View Post

      - transactions
      - locking
      - ACID
      - security
      - indexing
      - caching
      - ....
      Berkeley DB has ALL of those, except caching (a good thing, keep reading).

      "Indexing"??? Databases don't pick the indexes themselves. You have to pick them and make them. Again if you want an index it's just another Berkeley DB table (that's how the underlying database would do it anyway).

      Originally posted by droste View Post
      If the application uses an ORM the underlying DBMS can easily be replaced without touching the application itself.
      Nope. For example, if you are using Hibernate you have to recompile your projected classes if you remove columns.

      If you want to deal with removed columns, you can play with views and calculated fields but then you are just slowing down the database even more.

      Hahaha on you if you want ACID and caching in the same application!

      If you want to cache local results then make a HashMap and just store the keys and data locally. It's FASTER than using the database cache because you can index the cached objects directly.
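A minimal sketch of the local cache described above, using a Python dict where the post says HashMap. The names `ProfileCache` and `load_profile` are hypothetical, and `load_profile` is only a stand-in for whatever actually reads from the store. Note that the sketch has no invalidation at all, which is exactly where caching and ACID start to pull against each other.

```python
from typing import Callable, Dict


class ProfileCache:
    """Keeps already-fetched records in a local dict keyed by primary key."""

    def __init__(self, loader: Callable[[int], dict]) -> None:
        self._loader = loader              # hypothetical function that hits the real store
        self._entries: Dict[int, dict] = {}
        self.misses = 0                    # counts how often the loader had to run

    def get(self, key: int) -> dict:
        if key not in self._entries:
            self.misses += 1
            self._entries[key] = self._loader(key)
        return self._entries[key]


def load_profile(user_id: int) -> dict:
    # Stand-in for a real database read.
    return {"userid": user_id, "username": f"user{user_id}"}


cache = ProfileCache(load_profile)
first = cache.get(123)
second = cache.get(123)   # served from the dict, the loader is not called again
```

The second `get` returns the very same object as the first, so the lookup cost is one dict access rather than a round trip to the database.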

      Again why do you want to carry around so much baggage when JSON.encode and JSON.decode will do all of the work for you?



      • #48
        First: I only listed a few things a DBMS does. There is more.

        I'm not arguing with you that there are cases where a DB is too much. But I claim that JSON and "schema-free" databases are not the solution to all data storage problems. That was the point of my post. If it's a nail, use a hammer; if it's a screw, use a screwdriver. And it's not as simple as "is the application small, use JSON, otherwise DB".

        PS: My main problem with JSON is the missing metadata / schema (such as datatypes, constraints, etc...). I recognize that you don't like defined schemas, but there are good reasons to have them. This gets very funny if you have dates, for example, and you get "12.10.11". What the hell does that mean? Which is the year, the month and the day? Is it year 2000+? I know that there is such a thing as a JSON schema, but I haven't seen this in real life yet.



        • #49
          Originally posted by droste View Post
          But I claim that JSON and "schema-free" databases are not the solution to all data storage problems.
          We are not talking about "all data storage problems".

          We are talking about the kinds of applications where one might choose to use Microsoft's Entity Framework.

          Of course there are applications like payroll and A/R where SQL and a big database are the way to go, because the columns are fixed and you're doing big joins across a dozen tables.

          But here we are talking about "free form" applications where the O/R mapping is not straightforward. For example if you are writing a "facebook" type app and you want to let your users have a list of favorite colors, then you are going to have to make some ugly decisions on how to write your DDL if you want to use SQL. Or suppose you want to make networks of friends... This is the entire reason for using an "entity framework", because it's too complex to just code up some simple SQL queries for the job.

          Originally posted by droste View Post
          This gets very funny if you have dates for example and you get "12.10.11". What the hell does that mean? Which is the year, the month and the day?
          As if every computer language today didn't have standard date conversion routines. Geez, just pick ODBC format or something and get on with it. If your application is writing dates like that, that's YOUR bug.
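To make both sides of this exchange concrete, here is a small sketch using Python's standard `datetime` module. It parses the disputed string "12.10.11" under three plausible layouts, then parses a `YYYY-MM-DD` string (the digit order that ISO 8601 and ODBC date literals use), which has only one reading. The variable names are illustrative only.

```python
from datetime import date, datetime

raw = "12.10.11"

# Three plausible readings of the same string.
day_first = datetime.strptime(raw, "%d.%m.%y").date()     # 12 October 2011
month_first = datetime.strptime(raw, "%m.%d.%y").date()   # 10 December 2011
year_first = datetime.strptime(raw, "%y.%m.%d").date()    # 11 October 2012

# A four-digit-year, year-month-day string leaves nothing to guess.
unambiguous = date.fromisoformat("2011-10-12")
```

The conversion routines exist, as the reply says, but they only help once writer and reader have agreed on which format string applies.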

          Originally posted by droste View Post
          My main problem with JSON is the missing meta data / schema (such as datatypes, constraints, etc...)
          Just bundle up the metadata in its own JSON object and store that, too! Again THE SQL DATABASE IS DOING THE EXACT SAME THING UNDER THE COVERS.
          Last edited by frantaylor; 08-14-2012, 08:45 PM.



          • #50
            Originally posted by jrch2k8 View Post
            Sadly nope, Microsoft did this not out of the goodness of their heart or because they believe in open source [they don't]; they do it for these reasons:
            And this is my problem with the blind Microsoft hatred. Microsoft has some 150,000 people working for them between employees and contractors. Tons of those people are huge fans of Open Source and even Linux. People go to Microsoft for all kinds of reasons, including liking the other people who work there, believing in the products, wanting to advance their careers and getting resume fodder, being interested in the crazy R&D projects that most people never hear about, or just to make lots of money (in many cases, just for long enough to pay off student loans or otherwise get a good financial footing before moving on to a "better" job).

            When you have that many people there for that many different reasons with a huge variety of backgrounds, it is _absolutely ridiculous_ to think that everything the company does is some corporate policy handed down by the executives. The company is not under the control of a handful of people. Much of what Microsoft does is driven by the people at the bottom of the totem pole. Ballmer has surprisingly little control, and much less than a CEO in most companies; there's a very, very strong culture within the Microsoft campuses, and people and their opinions and ideas carry a surprising amount of weight. Sometimes this is not a good thing (just because a bunch of people think something is a good idea doesn't mean it is) and other times it is a very good thing (the acceptance of Open Source as a viable means of distributing and supporting software).

            The rise of Open Source at Microsoft is surely very, very slow. But it's happening because people on the inside keep pushing for it. Not because Ballmer is some kind of mad genius with a 190 IQ that can plan and manipulate the entire world by his every whim.

            Conspiracy theorists always end up giving _waaay_ too much credit to the people least deserving of it.



            • #51
              Originally posted by frantaylor View Post
              For example if you are writing a "facebook" type app and you want to let your users have a list of favorite colors, then you are going to have to make some ugly decisions on how to write your DDL if you want to use SQL.
              I fail to see how this is difficult to do in SQL. Maybe I don't understand what you are trying to achieve, but this would be my approach:

              Tables:
              Users (userid, username)
              FavoriteColors (userid, colorname)

              Add color:
              INSERT INTO FavoriteColors VALUES (123, 'green')

              Remove color:
              DELETE FROM FavoriteColors WHERE userid=123 AND colorname='green'

              Get all colors:
              SELECT colorname FROM FavoriteColors WHERE userid=123


              Objects:
              User (userid, ..., favoritecolors (a list))

              The O/R mapping is straightforward: a list of something inside an object -> a table of something with the primary key of the object as a foreign key.

              Doesn't look too difficult to me.
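The two-table layout and the mapping above can be sketched end to end with Python's built-in `sqlite3` module. Table and column names come from the post; the composite primary key, the `User` dataclass, the `load_user` helper and the sample user "alice" are assumptions added for illustration, not part of the original design.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import List


@dataclass
class User:
    userid: int
    username: str
    favoritecolors: List[str] = field(default_factory=list)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Users (userid INTEGER PRIMARY KEY, username TEXT NOT NULL);
    CREATE TABLE FavoriteColors (
        userid INTEGER NOT NULL REFERENCES Users(userid),
        colorname TEXT NOT NULL,
        PRIMARY KEY (userid, colorname)  -- assumption: one row per user/color pair
    );
    """
)
conn.execute("INSERT INTO Users VALUES (?, ?)", (123, "alice"))
conn.executemany(
    "INSERT INTO FavoriteColors VALUES (?, ?)",
    [(123, "green"), (123, "blue"), (123, "red")],
)
conn.execute(
    "DELETE FROM FavoriteColors WHERE userid = ? AND colorname = ?", (123, "red")
)


def load_user(db: sqlite3.Connection, userid: int) -> User:
    """Map one Users row plus its FavoriteColors rows onto a User object."""
    (username,) = db.execute(
        "SELECT username FROM Users WHERE userid = ?", (userid,)
    ).fetchone()
    colors = [
        row[0]
        for row in db.execute(
            "SELECT colorname FROM FavoriteColors WHERE userid = ? ORDER BY colorname",
            (userid,),
        )
    ]
    return User(userid, username, colors)


user = load_user(conn, 123)
```

The list inside the object becomes rows in the child table, and `load_user` is the hand-written version of what an ORM would generate.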

              Originally posted by frantaylor View Post
              THE SQL DATABASE IS DOING THE EXACT SAME THING UNDER THE COVERS
              Soooo... why reinvent the wheel if the database is doing the same thing? ;-)



              • #52
                Originally posted by droste View Post
                I fail to see how this is difficult to do in SQL.
                Here's what I'm talking about with "demo" applications that look good on paper but suck in production.

                I specifically said "facebook" because you are immediately talking about a user table with about 5 million rows in it.

                If you try loading up that colors table with all your users and all your colors, then performance is going to suck, unless you make indexes for all the queries. Just think, that color table of yours is going to have about 15 million rows if you have 5 million users. It's a terrible waste of resources for a table that has very little actual information in it.

                After you're done making all the indexes, you will find that INSERT performance now sucks, because you have to calculate all those indexes every time you do an insert.

                If you limit the number of colors available, the users will complain. If you don't limit the number of colors, then the size of that table will grow and the database will have to re-index it every time it changes.

                See, now we have introduced complexity and indexes and performance issues, and all we really wanted was a simple list of colors.

                If you use JSON then you get the entire user's profile with one indexing operation and performance will be great.

                Indexing a table with millions of rows is hard work for the database. There are disk fetches and cache misses and it's generally not a fast operation. You don't want to do it any more than you have to.

                You don't need to write a silly loop to extract all the colors from the sql table and plug them into the user object.

                The kind of code you are talking about is always rife with bugs because it's hard to do right. It's SO easy to make a typo that's hard to catch. You are whittling down the square pegs to fit them into round holes. The whole reason companies write ORM frameworks is that the result-set fetching loops required to support your SQL queries are bug magnets.

                There's an old quote from somewhere that I like to pull out at times like this:

                "the most reliable and bug free parts of any system are the ones that aren't there"

                Again, why clutter up your app with bug-prone code when you can just call JSON.decode()?

                And you can also ask Facebook if they run SQL queries inside of their pages; they will say "heck, no!"

                If you want to do fast lookups on users based on something like favorite colors, you use "bitmap indexes" and SQL is of limited help to you.

                What you do is take a bigint field and use one bit for each color. If you have 128-bit bigints then you have more colors than Crayola. Then you can index this field and you can ask "what people have blue and green as their favorite colors" and my query will run a heck of a lot faster than yours will. Of course this solution fixes the colors, and no user-defined colors are allowed, but the performance is excellent.

                If you don't like JSON then you can use Google Protocol Buffers. They're blazing fast.
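A rough sketch of the one-bit-per-color encoding, in plain Python. The palette in `COLOR_BITS` and the helper names are hypothetical, and the lookup here is a simple in-memory scan: it illustrates the encoding and the bitwise test, not how a database engine would build or use a bitmap index.

```python
from typing import Dict, Iterable, List

# Hypothetical fixed palette: each color owns one bit position.
COLOR_BITS: Dict[str, int] = {
    "red": 1 << 0,
    "green": 1 << 1,
    "blue": 1 << 2,
    "yellow": 1 << 3,
}


def encode_colors(colors: Iterable[str]) -> int:
    """Pack a set of color names into a single integer bitmask."""
    mask = 0
    for name in colors:
        mask |= COLOR_BITS[name]
    return mask


def users_with_all(masks: Dict[int, int], wanted: Iterable[str]) -> List[int]:
    """Return ids of users whose mask contains every wanted color."""
    want = encode_colors(wanted)
    return sorted(uid for uid, mask in masks.items() if (mask & want) == want)


user_masks = {
    1: encode_colors(["blue", "green"]),
    2: encode_colors(["red"]),
    3: encode_colors(["blue", "green", "yellow"]),
}
matches = users_with_all(user_masks, ["blue", "green"])
```

The trade-off named in the post shows up directly: a color that is not in `COLOR_BITS` cannot be stored at all.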
                Last edited by frantaylor; 08-14-2012, 11:47 PM.



                • #53
                  Originally posted by droste View Post
                  Soooo... why reinvent the wheel if the database is doing the same thing? ;-)
                  Why drag an ocean liner around with you when the only part you are using is the captain's chair?

                  If you're burdening the database with extensive consistency checks every time you do an insert, your performance is going to be terrible! Often it takes more cycles to verify the data than it does to save it! Constraints are for testing, take them out in production unless you like spending lots of money on big database hardware. Those big IBM AIX boxes will make short work of this kind of stuff but they cost a king's ransom.
                  Last edited by frantaylor; 08-14-2012, 11:56 PM.



                  • #54
                    Originally posted by frantaylor View Post
                    If you try loading up that colors table with all your users and all your colors, then performance is going to suck
                    Why would I try to do that? I don't see a reason to get all users, and especially not with all colors. You would set up the colors to be lazy loaded so you don't have the overhead when you just want the user.
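Lazy loading as described here can be sketched in a few lines of Python with `sqlite3`. The `LazyUser` class, its `color_queries` counter and the sample data are hypothetical; a real ORM would generate an equivalent proxy rather than have you write one by hand.

```python
import sqlite3
from typing import List, Optional


class LazyUser:
    """User whose color list is fetched only on first access (hypothetical class)."""

    def __init__(self, db: sqlite3.Connection, userid: int, username: str) -> None:
        self._db = db
        self.userid = userid
        self.username = username
        self._colors: Optional[List[str]] = None
        self.color_queries = 0   # counts SELECTs issued against FavoriteColors

    @property
    def favoritecolors(self) -> List[str]:
        if self._colors is None:
            self.color_queries += 1
            rows = self._db.execute(
                "SELECT colorname FROM FavoriteColors WHERE userid = ? ORDER BY colorname",
                (self.userid,),
            )
            self._colors = [row[0] for row in rows]
        return self._colors


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE FavoriteColors (userid INTEGER, colorname TEXT)")
db.executemany(
    "INSERT INTO FavoriteColors VALUES (?, ?)", [(123, "green"), (123, "blue")]
)

user = LazyUser(db, 123, "alice")
name_only = user.username             # no query against FavoriteColors yet
queries_before = user.color_queries
colors = user.favoritecolors          # first access triggers the SELECT
colors_again = user.favoritecolors    # answered from the cached list
```

Code that only needs the username never touches the colors table, which is the overhead being avoided.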

                    Originally posted by frantaylor View Post
                    If you use JSON then you get the entire user's profile with one indexing operation and performance will be great.
                    Unless there is a lot of information stored with the user and all I want is the username, for example. I don't need to load the huge amount of other information just for the username, but with JSON I have to.

                    Originally posted by frantaylor View Post
                    You don't need to write a silly loop to extract all the colors from the sql table and plug them into the user object.
                    That's what the ORM is for ;-) You don't do it, it does it for you.

                    Originally posted by frantaylor View Post
                    The kind of code you are talking about is always rife with bugs because it's hard to do right. It's SO easy to make a typo that's hard to catch. You are whittling down the square pegs to fit them into round holes. The whole reason companies write ORM frameworks is because the result-set fetching loops required to support your SQL queries, they are bug magnets.
                    So? If the ORM framework removes the burden of maintaining the error-prone SQL stuff, it is a good thing.

                    Originally posted by frantaylor View Post
                    What you do is take a bigint field and use one bit for each color. If you have 128 bit bigints then you have more colors than Crayola. Then you can index this field and you can ask "what people have blue and green as their favorite colors" and my query will run a heck of a lot faster than yours will. Of course this solution fixes the colors, and no user-defined colors are allowed, but the performance is excellent.
                    Yeah... if I fix the colors I would set up a table for green and a table for blue and the query would just be "give me the whole content of this". No indexing required and we're back to high speed again.

                    And again: I'm not arguing with you that there are workloads where SQL is bad, but I still say that ORM is not generally bad. There are only a few applications that have to handle the workload Facebook or Google have.

                    Originally posted by frantaylor View Post
                    Constraints are for testing, take them out in production unless you like spending lots of money on big database hardware. Those big IBM AIX boxes will make short work of this kind of stuff but they cost a king's ransom.
                    This might work if your application is the only one that is working with the data, but if you have many applications that work with the same data, the database is the only place to enforce some constraints, to keep existing applications working by not suddenly changing some datatypes or formats for example.



                    • #55
                      Originally posted by frantaylor View Post
                      In other words it's the new "Microsoft Access": quickly create inefficient applications that use huge resources, are unscalable, and must be re-written from scratch when they need to be scaled.

                      Toys for boys
                      I don't want to sound like an MS fanboy, but it seems that you never understood what MS Access was and/or what Entity is. If you had said that Visual Studio LightSwitch is the new MS Access, it would make some sense (even if it is a loose comparison).
                      If you meant that Entity is the new ADO.NET (which is probably what you are thinking of), I agree: it is built on top of ADO.NET, it works well with LINQ, and it does a lot of things transparently.

                      What is so unscalable about Entity? Entity is used inside Bing, which, even if it is not Google, is still scalable, isn't it? Can't you execute it from multiple threads? Is the overhead of CLR reflection too big?
                      Did you try to make a 64-thread application and run it on a machine like this one?

                      Even if it were not so scalable (though this wording seems fairly redundant), most of the time it is the database backend (the RDBMS itself) that has to be performant, not the thin client querying it.

                      Or is it that, for you, Mono is not scalable, and based on that you mean that Entity will be just more bloat...
                      By that logic, you most likely run a browser whose JS runs at half the speed of Mono in the best case (when the code is close to C-like), and whose SQLite database doesn't scale, which means that we are all running slow, inefficient applications, right?
                      So is that why you use a text-based console browser?

                      In the end, are you a developer? Or are you a troll who happens to dislike Mono?
                      If you're using an Android phone, are you against the Android VM, which is much slower than Mono for Android? If you're using Ubuntu, Fedora, or others, a lot of your tools are written in slow Python, which has a Global Interpreter Lock; that really is unscalable and slow in multithreaded applications.
                      Last edited by ciplogic; 08-15-2012, 04:08 AM.



                      • #56
                        I just installed one of our MVC3 apps to Linux/Apache/Mono. It worked. If Mono saves me money on server OS costs, I'm all for it. As far as EF is concerned, I see it like Rails. If the ORM helps me solve small-med projects fast I'll use it. I'm happy to see EF on Linux!



                        • #57
                          Oh and anyone here use Hibernate? I have for some very large projects. The cost needed for performance HQL tuning/lazy loading etc was more than made up in a build environment that supported quick and easy schema changes which in turn made it super easy to fold in new devs quickly with a minimal amount of domain specific training to get them up and running.

                          Of course you can't just toss ORM at a large project and hope it works!!!



                          • #58
                            Originally posted by droste View Post
                            Why would I try to do that? I don't see a reason to get all users and especially not with all colors. You would setup the color to be lazy loaded so you don't have the overhead when you just want the user.
                            You know that query execution time depends on the number of rows IN THE TABLE not the number of rows returned from the query, right? If you have 15 million rows in a table, it's going to take a LOT longer to select one row than a table with just a few entries?

                            Originally posted by droste View Post
                            That's what the ORM is for ;-) You don't do it, it does it for you.
                            NO, ORM does NOT manage your data for you. It does NOT make decisions about how to store your data based on how much data you have. ORM stands for "Object-Relational Mapping". ALL it does is turn your object into SQL rows and then back again. It does NOT pick your indexes (it probably makes a primary key, but that's not enough for your color table), it does NOT tell you that your color table is a lousy design.

                            Originally posted by droste View Post
                            There are only a few applications that have to handle the workload Facebook or Google have.
                            How do you know this? What about sites that have finite budgets for computer hardware? They must accommodate their users with limited hardware so they must squeeze performance wherever they can.

                            If you actually want to enforce constraints on a busy database, you OFFLOAD the constraint calculations to another system. You can do this by making the data available to other apps via SOAP or some such, and write a small server that does the constraint calculations and then stores the data in the database.

                            You don't want your database engine to be calculating constraints! You want your database to be fetching and storing data.

                            Originally posted by droste View Post
                            suddenly changing some datatypes or formats for example
                            This is another reason for using a SOAP-like service for applications: you don't WANT your applications to know or care about how the data is stored in your database. If you expose SQL then you will be constantly updating your apps as you change the schema, unless you use SQL views, but they slow things down, too.

                            Also SQL data permissions often don't line up with your application. You will encounter situations where SQL permissions are not able to adequately represent your security requirements. In these cases you must write an intermediate agent to do the security for you, which would be part of the SOAP engine I mentioned above.

                            Separating the mechanics of the data storage from the application is a good design principle. If your stated worry is real, this is how you deal with it.

                            Oh and by the way: did you understand why I said "you have choices" when I described the "favorite colors" requirement? It's because your way of doing it is the obvious and slow one. Like I said, bitmap indexes can convey the same information and they are much much faster than your method.
                            Last edited by frantaylor; 08-15-2012, 10:55 AM.



                            • #59
                              Originally posted by frantaylor View Post
                              "CREATE TABLE myData (K INTEGER AUTO_INCREMENT PRIMARY KEY, D VARCHAR(262144))"
                              d = JSON.encode(myObject);
                              sql.execute("INSERT INTO myData(D) VALUE (?)", d)
                              key = sql.lastInsertedKey ...
                              ---
                              sql.execute("SELECT D FROM myData WHERE K=?", key)
                              myObject = JSON.decode(sql.fetch)
                              ---
                              This is where my forehead slammed into my palm.

                              Originally posted by frantaylor View Post
                              really if you want an "entity framework" or an "ORM" for a quick demo application you can just use a two-column table and JSON
                              Before the railing.... For the record, I don't use ORM. I rolled my own interfaces to make MySQL and Oracle databases _look similar_ in basic access, but I still write my SQL.

                              Your JSON example is an extremely poor solution to the problems ORM strives to solve. It isn't as simple as object preservation; you have to be able to get the data out and compare it to other data structures. Let's say I'm an online book retailer and I want to look at all Mystery books under $10. Using your solution, I'm basically going to have to load the entire database into RAM and then de-serialize each and every record back into an object (parsing text, allocating memory) before I can finally filter it by the criteria I'd like.
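The difference can be sketched with `sqlite3` and `json` from the Python standard library. The `myData` table follows the quoted two-column layout (using SQLite's `AUTOINCREMENT` and `TEXT` in place of the quoted MySQL-style types); the `Books` table, its columns and the three sample books are made up for illustration.

```python
import json
import sqlite3

books = [
    {"title": "A", "genre": "Mystery", "price": 7.5},
    {"title": "B", "genre": "Mystery", "price": 14.0},
    {"title": "C", "genre": "Romance", "price": 6.0},
]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE myData (K INTEGER PRIMARY KEY AUTOINCREMENT, D TEXT)")
db.execute("CREATE TABLE Books (title TEXT, genre TEXT, price REAL)")
for book in books:
    db.execute("INSERT INTO myData (D) VALUES (?)", (json.dumps(book),))
    db.execute(
        "INSERT INTO Books VALUES (?, ?, ?)",
        (book["title"], book["genre"], book["price"]),
    )

# Blob approach: every row is fetched and decoded before it can be filtered.
decoded = [json.loads(row[0]) for row in db.execute("SELECT D FROM myData")]
cheap_from_blobs = sorted(
    b["title"] for b in decoded if b["genre"] == "Mystery" and b["price"] < 10
)

# Column approach: the database applies the filter and returns only matches.
cheap_from_columns = [
    row[0]
    for row in db.execute(
        "SELECT title FROM Books WHERE genre = ? AND price < ? ORDER BY title",
        ("Mystery", 10),
    )
]
```

Both paths give the same answer, but the blob path decodes every stored record to find it, and that cost grows with the size of the table rather than the size of the result.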

                              Even a basic key-value pair schema (which is what Microsoft's "Entity Framework" does) is a better option... and with that, good luck when you have schemas with trillions of rows.

                              Real ORM is a developer-written class that acts as an extension to an ORM framework, implementing very specific interfaces. This class handles all custom logic for your database schema in an optimized and schema-specific manner. Once you've done this work, you can reap the benefits of not having to do it again... every query being optimized for you, automatically. Your solution solves only problems that personal websites encounter (and poorly), not something with decent volume.



                              • #60
                                Originally posted by frantaylor View Post
                                You know that query execution time depends on the number of rows IN THE TABLE not the number of rows returned from the query, right? If you have 15 million rows in a table, it's going to take a LOT longer to select one row than a table with just a few entries?
                                No, not right; that is not how database queries work. The time the database spends finding the row is not the only thing that is costly. The overall speed depends on how much data is in the table, what data you want from the row, how you search for the row, and how you transport the data to the application.

                                A few examples:
                                - Imagine the following DB2 SQL query: SELECT myPrimaryKey FROM myTable FETCH FIRST 1 ROWS ONLY. This will be fast no matter how many entries are in the table.
                                - When you only query data that is indexed, the database can greatly reduce disk I/O (the slow part).
                                - If you have a network between you and the database, preselecting the data you need can greatly reduce your network traffic.
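The indexing point can be checked on a small scale with SQLite, which ships with Python. This sketch asks the query planner how it intends to run the same lookup before and after an index exists. The table reuses the `FavoriteColors` layout from earlier in the thread; the index name `idx_colors_user` and the `plan` helper are hypothetical, and the exact wording of the planner output varies between SQLite versions.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE FavoriteColors (userid INTEGER, colorname TEXT)")
db.executemany(
    "INSERT INTO FavoriteColors VALUES (?, ?)",
    [(uid, color) for uid in range(1000) for color in ("red", "green", "blue")],
)

query = "SELECT colorname FROM FavoriteColors WHERE userid = 123"


def plan(connection: sqlite3.Connection, sql: str) -> str:
    """Return the planner's description of how it will run the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)   # last column holds the detail text


plan_without_index = plan(db, query)   # expected to describe a full table scan
db.execute("CREATE INDEX idx_colors_user ON FavoriteColors (userid)")
plan_with_index = plan(db, query)      # expected to describe a search via the index
```

With the index in place the engine can jump to the matching rows instead of reading all of them, so the cost of the lookup depends far less on how large the table has grown.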

                                Have you ever actually used a database? And not every program written presents you with a website viewable via the internet. There are many programs out there that just present data from a DB within a thin client to the user. It's not a good idea to always send all information you have to the client, because it slows down the client really fast.

