Mono 2.11.3 Packs In Microsoft's Entity Framework


  • #51
    Originally posted by frantaylor
    For example if you are writing a "facebook" type app and you want to let your users have a list of favorite colors, then you are going to have to make some ugly decisions on how to write your DDL if you want to use SQL.
    I fail to see how this is difficult to do in SQL. Maybe I don't understand what you are trying to achieve, but this would be my approach:

    Tables:
    Users (userid, username)
    FavoriteColors (userid, colorname)

    Add color:
    INSERT INTO FavoriteColors VALUES (123, 'green')

    Remove color:
    DELETE FROM FavoriteColors WHERE userid=123 AND colorname='green'

    Get all colors:
    SELECT colorname FROM FavoriteColors WHERE userid=123


    Objects:
    User (userid, ..., favoritecolors (a list))

    The O/R mapping is straightforward: a list of something inside an object -> a table of something with the primary key of the object as a foreign key.

    Doesn't look too difficult to me.
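
For anyone who wants to try it, here is a minimal sketch of the two-table approach above using Python's built-in sqlite3 module and an in-memory database. The helper names (open_db, add_color, remove_color, get_colors), the sample user 'alice', and the composite primary key are assumptions added for illustration; they are not from the post.

```python
import sqlite3


def open_db():
    """Create an in-memory database with the two tables described above."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE Users (userid INTEGER PRIMARY KEY, username TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE FavoriteColors ("
        " userid INTEGER NOT NULL REFERENCES Users(userid),"
        " colorname TEXT NOT NULL,"
        " PRIMARY KEY (userid, colorname))"  # assumption: one row per (user, color)
    )
    return conn


def add_color(conn, userid, colorname):
    conn.execute("INSERT INTO FavoriteColors VALUES (?, ?)", (userid, colorname))


def remove_color(conn, userid, colorname):
    conn.execute(
        "DELETE FROM FavoriteColors WHERE userid = ? AND colorname = ?",
        (userid, colorname),
    )


def get_colors(conn, userid):
    rows = conn.execute(
        "SELECT colorname FROM FavoriteColors WHERE userid = ? ORDER BY colorname",
        (userid,),
    )
    return [row[0] for row in rows]


conn = open_db()
conn.execute("INSERT INTO Users VALUES (123, 'alice')")  # hypothetical sample user
add_color(conn, 123, "green")
add_color(conn, 123, "blue")
remove_color(conn, 123, "green")
print(get_colors(conn, 123))
```

The composite primary key doubles as the index that the "get all colors" query needs, so no separate index is declared in this sketch.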

    Originally posted by frantaylor
    THE SQL DATABASE IS DOING THE EXACT SAME THING UNDER THE COVERS
    Soooo... why reinvent the wheel if the database is doing the same thing? ;-)



    • #52
      Originally posted by droste
      I fail to see how this is difficult to do in SQL.
      Here's what I'm talking about with "demo" applications that look good on paper but suck in production.

      I specifically said "facebook" because you are immediately talking about a user table with about 5 million rows in it.

      If you try loading up that colors table with all your users and all your colors, then performance is going to suck, unless you make indexes for all the queries. Just think, that color table of yours is going to have about 15 million rows if you have 5 million users. It's a terrible waste of resources for a table that has very little actual information in it.

      After you're done making all the indexes, you will find that INSERT performance now sucks, because you have to calculate all those indexes every time you do an insert.

      If you limit the number of colors available, the users will complain. If you don't limit the number of colors, then the size of that table will grow and the database will have to re-index it every time it changes.

      See now we have introduced complexity and indexes and performance issues and all we really wanted was a simple list of colors.

      If you use JSON then you get the entire user's profile with one indexing operation and performance will be great.

      Indexing a table with millions of rows is hard work for the database. There are disk fetches and cache misses and it's generally not a fast operation. You don't want to do it any more than you have to.

      You don't need to write a silly loop to extract all the colors from the sql table and plug them into the user object.

      The kind of code you are talking about is always rife with bugs because it's hard to do right. It's SO easy to make a typo that's hard to catch. You are whittling down the square pegs to fit them into round holes. The whole reason companies write ORM frameworks is that the result-set fetching loops required to support your SQL queries are bug magnets.

      There's an old quote from somewhere that I like to pull out at times like this:

      "the most reliable and bug free parts of any system are the ones that aren't there"

      Again, why clutter up your app with bug-prone code when you can just call JSON.decode()?
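
A rough sketch of the "whole profile as one JSON document" idea, assuming Python's json and sqlite3 modules stand in for the JSON.decode() call and the database. The table name profiles and the helpers save_profile and load_profile are hypothetical names chosen for this example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per user: the key plus the serialized profile document.
conn.execute("CREATE TABLE profiles (userid INTEGER PRIMARY KEY, doc TEXT NOT NULL)")


def save_profile(userid, profile):
    """Serialize the whole profile and store it under the user's key."""
    conn.execute(
        "INSERT OR REPLACE INTO profiles (userid, doc) VALUES (?, ?)",
        (userid, json.dumps(profile)),
    )


def load_profile(userid):
    """Fetch by primary key and decode; returns None for an unknown user."""
    row = conn.execute(
        "SELECT doc FROM profiles WHERE userid = ?", (userid,)
    ).fetchone()
    return json.loads(row[0]) if row else None


save_profile(123, {"username": "alice", "favoritecolors": ["green", "blue"]})
profile = load_profile(123)
print(profile["favoritecolors"])
```

One primary-key lookup returns everything, with no per-color rows to loop over. The trade-off is that the database cannot see inside doc, so any query on favorite colors has to decode the documents first.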

      And you can also ask Facebook whether they run SQL queries inside their pages; they will say "heck, no!"

      If you want to do fast lookups on users based on something like favorite colors, you use "bitmap indexes" and SQL is of limited help to you.

      What you do is take a bigint field and use one bit for each color. If you have 128 bit bigints then you have more colors than Crayola. Then you can index this field and you can ask "what people have blue and green as their favorite colors" and my query will run a heck of a lot faster than yours will. Of course this solution fixes the colors, and no user-defined colors are allowed, but the performance is excellent.
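
To make the bit-per-color encoding concrete, here is a small sketch in plain Python. The four-color PALETTE, the sample users, and the helper names encode and has_all are made up for illustration, and the scan over a dict only demonstrates the encoding and the bit test; it says nothing about how a database would index the field.

```python
# Hypothetical fixed palette: the position in this list is the bit number.
PALETTE = ["red", "green", "blue", "yellow"]
BIT = {name: 1 << i for i, name in enumerate(PALETTE)}


def encode(colors):
    """Pack a collection of color names into one integer bit field."""
    mask = 0
    for color in colors:
        mask |= BIT[color]
    return mask


def has_all(mask, colors):
    """True if every requested color's bit is set in mask."""
    want = encode(colors)
    return (mask & want) == want


# Hypothetical users mapped to their packed favorite colors.
users = {
    1: encode(["blue", "green"]),
    2: encode(["red"]),
    3: encode(["blue", "green", "yellow"]),
}
matches = sorted(uid for uid, mask in users.items() if has_all(mask, ["blue", "green"]))
print(matches)
```

Adding a color to the palette means assigning it the next bit, which is why this approach only works when the set of colors is fixed in advance.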

      If you don't like JSON then you can use google protocol buffers. They're blazing fast.
      Last edited by frantaylor; 14 August 2012, 11:47 PM.



      • #53
        Originally posted by droste
        Soooo... why reinvent the wheel if the database is doing the same thing? ;-)
        Why drag an ocean liner around with you when the only part you are using is the captain's chair?

        If you're burdening the database with extensive consistency checks every time you do an insert, your performance is going to be terrible! Often it takes more cycles to verify the data than it does to save it! Constraints are for testing, take them out in production unless you like spending lots of money on big database hardware. Those big IBM AIX boxes will make short work of this kind of stuff but they cost a king's ransom.
        Last edited by frantaylor; 14 August 2012, 11:56 PM.



        • #54
          Originally posted by frantaylor
          If you try loading up that colors table with all your users and all your colors, then performance is going to suck
          Why would I try to do that? I don't see a reason to get all users, and especially not with all colors. You would set up the colors to be lazy loaded so you don't have the overhead when you just want the user.
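
Lazy loading can be sketched by hand in a few lines. This is only an illustration of the idea, not how any particular ORM implements it; the User class, the color_queries counter, and the sample data are assumptions for the example.

```python
import sqlite3


class User:
    """Hypothetical user object whose color list is fetched on first access only."""

    def __init__(self, conn, userid, username):
        self._conn = conn
        self.userid = userid
        self.username = username
        self._colors = None     # not loaded yet
        self.color_queries = 0  # counts how often the colors table was hit

    @property
    def favoritecolors(self):
        if self._colors is None:
            self.color_queries += 1
            rows = self._conn.execute(
                "SELECT colorname FROM FavoriteColors WHERE userid = ? ORDER BY colorname",
                (self.userid,),
            )
            self._colors = [row[0] for row in rows]
        return self._colors


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FavoriteColors (userid INTEGER, colorname TEXT)")
conn.executemany(
    "INSERT INTO FavoriteColors VALUES (?, ?)", [(123, "green"), (123, "blue")]
)

user = User(conn, 123, "alice")
queries_before = user.color_queries  # reading the username costs no color query
colors = user.favoritecolors         # first access runs the SELECT
colors_again = user.favoritecolors   # second access reuses the cached list
print(user.username, colors, user.color_queries)
```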

          Originally posted by frantaylor
          If you use JSON then you get the entire user's profile with one indexing operation and performance will be great.
          Unless there is a lot of information stored within the user and all I want is the username, for example. I don't need to load the huge amount of other information just for the username, but with JSON I have to.

          Originally posted by frantaylor
          You don't need to write a silly loop to extract all the colors from the sql table and plug them into the user object.
          That's what the ORM is for ;-) You don't do it, it does it for you.

          Originally posted by frantaylor
          The kind of code you are talking about is always rife with bugs because it's hard to do right. It's SO easy to make a typo that's hard to catch. You are whittling down the square pegs to fit them into round holes. The whole reason companies write ORM frameworks is that the result-set fetching loops required to support your SQL queries are bug magnets.
          So? If the ORM framework removes the burden of maintaining the error-prone SQL stuff, it is a good thing.

          Originally posted by frantaylor
          What you do is take a bigint field and use one bit for each color. If you have 128 bit bigints then you have more colors than Crayola. Then you can index this field and you can ask "what people have blue and green as their favorite colors" and my query will run a heck of a lot faster than yours will. Of course this solution fixes the colors, and no user-defined colors are allowed, but the performance is excellent.
          Yeah... if I fix the colors, I would set up a table for green and a table for blue, and the query would just be "give me the whole content of this". No indexing required and we're back to high speed again.

          And again: I'm not arguing with you that there are workloads where SQL is bad, but I still say that ORM is not generally bad. There are only a few applications that have to handle the workload Facebook or Google have.

          Originally posted by frantaylor
          Constraints are for testing, take them out in production unless you like spending lots of money on big database hardware. Those big IBM AIX boxes will make short work of this kind of stuff but they cost a king's ransom.
          This might work if your application is the only one that is working with the data, but if you have many applications that work with the same data, the database is the only place to enforce some constraints, to keep existing applications working by not suddenly changing some datatypes or formats for example.
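
As a small illustration of the database being the one place that enforces rules for every client, here is a sketch using SQLite from Python. The specific constraints (foreign key, non-empty color name, no duplicates) and the helper try_insert are assumptions picked for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled
conn.execute("CREATE TABLE Users (userid INTEGER PRIMARY KEY, username TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE FavoriteColors (
           userid INTEGER NOT NULL REFERENCES Users(userid),
           colorname TEXT NOT NULL CHECK (length(colorname) > 0),
           UNIQUE (userid, colorname))"""
)
conn.execute("INSERT INTO Users VALUES (123, 'alice')")  # hypothetical sample user


def try_insert(userid, colorname):
    """Return True if the database accepted the row, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO FavoriteColors VALUES (?, ?)", (userid, colorname))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(123, "green"),  # valid row
    try_insert(123, "green"),  # duplicate, violates UNIQUE
    try_insert(999, "blue"),   # unknown user, violates the foreign key
    try_insert(123, ""),       # empty name, violates CHECK
]
print(results)
```

Whichever application sends the bad row, it is rejected the same way, which is the point being made about several applications sharing one database.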



          • #55
            Originally posted by frantaylor
            In other words it's the new "Microsoft Access": quickly create inefficient applications that use huge resources, are unscalable, and must be re-written from scratch when they need to be scaled.

            Toys for boys
            I don't want to sound like an MS apologist, but it seems you never understood what MS Access was and/or what Entity is. If you said that Visual Studio LightSwitch is the new MS Access, it would make some sense (even if it is a loose comparison).
            If you meant that Entity is the new ADO.NET (which is probably what you are thinking of), I agree: it is built on top of ADO.NET, it behaves well with LINQ, and it does a lot of things transparently.

            What is so unscalable about Entity? Entity is used inside Bing, which, even if it is not Google, is still scalable, isn't it? Can't you execute it from multiple threads? Is the overhead of CLR reflection too big?
            Did you try to make a 64-thread application and run it on a machine like this one?

            Even if it were not so scalable (even this wording seems fairly redundant), most of the time it is the database backend (the RDBMS itself) that has to be performant, not the thin client querying it.

            Or, for you, Mono is not scalable, and based on that you mean that Entity will be just more bloat...
            That would mean you most likely run a browser whose JS runs at half the speed of Mono in the best case (when the code is close to C-like), and whose SQLite database doesn't scale, which means we are all running slow, inefficient applications, right?
            So is that why you use a text-based console browser?

            In the end, are you a developer? Or are you a troll who happens to dislike Mono?
            If you're using an Android phone, are you against the Android VM, which is much slower than Mono for Android? If you're using Ubuntu, Fedora, or others, a lot of your tools are written in slow Python, which has a Global Interpreter Lock; that really is unscalable and slow in multithreaded applications.
            Last edited by ciplogic; 15 August 2012, 04:08 AM.



            • #56
              I just installed one of our MVC3 apps to Linux/Apache/Mono. It worked. If Mono saves me money on server OS costs, I'm all for it. As far as EF is concerned, I see it like Rails. If the ORM helps me solve small-med projects fast I'll use it. I'm happy to see EF on Linux!



              • #57
                Oh and anyone here use Hibernate? I have for some very large projects. The cost needed for performance HQL tuning/lazy loading etc was more than made up in a build environment that supported quick and easy schema changes which in turn made it super easy to fold in new devs quickly with a minimal amount of domain specific training to get them up and running.

                Of course you can't just toss ORM at a large project and hope it works!!!



                • #58
                  Originally posted by droste
                  Why would I try to do that? I don't see a reason to get all users and especially not with all colors. You would setup the color to be lazy loaded so you don't have the overhead when you just want the user.
                  You know that query execution time depends on the number of rows IN THE TABLE not the number of rows returned from the query, right? If you have 15 million rows in a table, it's going to take a LOT longer to select one row than a table with just a few entries?

                  That's what the ORM is for ;-) You don't do it, it does it for you.
                  NO, ORM does NOT manage your data for you. It does NOT make decisions about how to store your data based on how much data you have. ORM stands for "Object-Relational Mapping". ALL it does is turn your object into SQL rows and then back again. It does NOT pick your indexes (it probably makes a primary key, but that's not enough for your color table), and it does NOT tell you that your color table is a lousy design.

                  There are only a few applications that have to handle the workload facebook or google have.
                  How do you know this? What about sites that have finite budgets for computer hardware? They must accommodate their users with limited hardware, so they must squeeze out performance wherever they can.

                  If you actually want to enforce constraints on a busy database, you OFFLOAD the constraint calculations to another system. You can do this by making the data available to other apps via SOAP or some such, and write a small server that does the constraint calculations and then stores the data in the database.

                  You don't want your database engine to be calculating constraints! You want your database to be fetching and storing data.

                  suddenly changing some datatypes or formats for example
                  This is another reason for using a SOAP-like service for applications: you don't WANT your applications to know or care about how the data is stored in your database. If you expose SQL then you will be constantly updating your apps as you change the schema, unless you use SQL views, but they slow things down, too.

                  Also SQL data permissions often don't line up with your application. You will encounter situations where SQL permissions are not able to adequately represent your security requirements. In these cases you must write an intermediate agent to do the security for you, which would be part of the SOAP engine I mentioned above.

                  Separating the mechanics of the data storage from the application is a good design principle. If your stated worry is real, this is how you deal with it.
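
A very rough in-process stand-in for that kind of service layer, sketched in Python with sqlite3. The ProfileService class, its method names, and the deliberately terse table names (u, fc) are all hypothetical; a real deployment would sit behind SOAP or some other network protocol, which this sketch leaves out.

```python
import sqlite3


class ProfileService:
    """Hypothetical facade: callers exchange plain dicts and never see table or column names."""

    def __init__(self, conn):
        self._conn = conn
        conn.execute("CREATE TABLE IF NOT EXISTS u (id INTEGER PRIMARY KEY, name TEXT)")
        conn.execute("CREATE TABLE IF NOT EXISTS fc (uid INTEGER, color TEXT)")

    def create_user(self, userid, username):
        self._conn.execute("INSERT INTO u VALUES (?, ?)", (userid, username))

    def get_profile(self, userid):
        row = self._conn.execute("SELECT name FROM u WHERE id = ?", (userid,)).fetchone()
        if row is None:
            return None
        colors = [
            r[0]
            for r in self._conn.execute(
                "SELECT color FROM fc WHERE uid = ? ORDER BY color", (userid,)
            )
        ]
        return {"userid": userid, "username": row[0], "favoritecolors": colors}

    def add_favorite_color(self, userid, color):
        # Validation lives in the service, not in database constraints.
        color = color.strip().lower()
        if not color:
            raise ValueError("color must not be empty")
        profile = self.get_profile(userid)
        if profile is None:
            raise ValueError("unknown user")
        if color not in profile["favoritecolors"]:
            self._conn.execute("INSERT INTO fc VALUES (?, ?)", (userid, color))


service = ProfileService(sqlite3.connect(":memory:"))
service.create_user(123, "alice")
service.add_favorite_color(123, " Green ")
service.add_favorite_color(123, "green")  # duplicate is ignored by the service
print(service.get_profile(123))
```

Because callers only ever see the dict returned by get_profile, the tables behind it could be renamed or replaced without touching them.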

                  Oh and by the way: did you understand why I said "you have choices" when I described the "favorite colors" requirement? It's because your way of doing it is the obvious and slow one. Like I said, bitmap indexes can convey the same information and they are much much faster than your method.
                  Last edited by frantaylor; 15 August 2012, 10:55 AM.



                  • #59
                    Originally posted by frantaylor
                    "CREATE TABLE myData (K INTEGER AUTO_INCREMENT PRIMARY KEY, D VARCHAR(262144))"
                    d = JSON.encode(myObject);
                    sql.execute("INSERT INTO myData(D) VALUE (?)", d)
                    key = sql.lastInsertedKey ...
                    ---
                    sql.execute("SELECT D FROM myData WHERE K=?", key)
                    myObject = JSON.decode(sql.fetch)
                    ---
                    This is where my forehead slammed into my palm.

                    Originally posted by frantaylor
                    really if you want an "entity framework" or an "ORM" for a quick demo application you can just use a two-column table and JSON
                    Before the railing.... For the record, I don't use ORM. I rolled my own interfaces to make MySQL and Oracle databases _look similar_ in basic access, but I still write my SQL.

                    Your JSON example is an extremely poor solution to the problems ORM strives to solve. It isn't as simple as object preservation: you have to be able to get the data out and compare it to other data structures. Let's say I'm an online book retailer and I want to look at all Mystery books under $10. Using your solution, I'm basically going to have to load the entire database into RAM and then deserialize each and every record (parsing text, allocating memory) back into an object before I can finally filter it by the criteria I'd like.
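
To illustrate the difference, here is a sketch with SQLite and Python's json module. The book titles, the price_cents column, the index, and both function names are invented for the example; the myData table loosely follows the two-column layout quoted above.

```python
import json
import sqlite3

# Hypothetical catalogue: (title, genre, price in cents).
BOOKS = [
    ("The Hound", "Mystery", 799),
    ("Gone Quiet", "Mystery", 1499),
    ("Star Drift", "SciFi", 599),
    ("Cold Case", "Mystery", 999),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myData (K INTEGER PRIMARY KEY AUTOINCREMENT, D TEXT)")
conn.execute("CREATE TABLE books (title TEXT, genre TEXT, price_cents INTEGER)")
conn.execute("CREATE INDEX idx_books_genre_price ON books (genre, price_cents)")
for title, genre, cents in BOOKS:
    blob = json.dumps({"title": title, "genre": genre, "price_cents": cents})
    conn.execute("INSERT INTO myData (D) VALUES (?)", (blob,))
    conn.execute("INSERT INTO books VALUES (?, ?, ?)", (title, genre, cents))


def cheap_mysteries_from_blobs():
    """Blob layout: every row must be fetched and decoded before it can be filtered."""
    decoded = 0
    hits = []
    for (blob,) in conn.execute("SELECT D FROM myData"):
        book = json.loads(blob)
        decoded += 1
        if book["genre"] == "Mystery" and book["price_cents"] < 1000:
            hits.append(book["title"])
    return sorted(hits), decoded


def cheap_mysteries_from_columns():
    """Column layout: the database filters and only matching rows come back."""
    rows = conn.execute(
        "SELECT title FROM books WHERE genre = ? AND price_cents < ? ORDER BY title",
        ("Mystery", 1000),
    )
    return [row[0] for row in rows]


blob_titles, rows_decoded = cheap_mysteries_from_blobs()
column_titles = cheap_mysteries_from_columns()
print(blob_titles, rows_decoded, column_titles)
```

Both functions return the same titles, but the blob version decodes every record in the table to get there, while the column version lets the database do the filtering.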

                    Even a basic key-value pair schema (which is what Microsoft's "Entity Framework" does) is a better option... and with that, good luck when you have schemas with trillions of rows.

                    Real ORM is a developer written class that acts as an extension to an ORM framework, implementing very specific interfaces. This class handles all custom logic for your database schema in an optimized and schema-specific manner. Once you've done this work, you can reap the benefits of not having to do it again... every query being optimized for you, automatically. Your solution solves only problems that personal websites encounter (and poorly), not something with decent volume.



                    • #60
                      Originally posted by frantaylor
                      You know that query execution time depends on the number of rows IN THE TABLE not the number of rows returned from the query, right? If you have 15 million rows in a table, it's going to take a LOT longer to select one row than a table with just a few entries?
                      No, that's not right; that is not how database queries work. The time the database spends finding the row is not the only thing that is costly. The overall speed depends on how much data is in the table, what data you want from the row, how you search for the row, and how you transport the data to the application.

                      A few examples:
                      - Imagine the following DB2 SQL query: SELECT myPrimaryKey FROM myTable FETCH FIRST 1 ROWS ONLY. This will be fast no matter how many entries are in the table.
                      - When you only query data that is indexed, the database can greatly reduce disk I/O (the slow part).
                      - If there is a network between you and the database, preselecting the data you need can greatly reduce your network traffic.
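
The indexing point can be checked on a small scale with SQLite's EXPLAIN QUERY PLAN, sketched here from Python. SQLite is used instead of DB2 purely for convenience; the table contents, the index name idx_fc_userid, and the helper plan are assumptions, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FavoriteColors (userid INTEGER, colorname TEXT)")
conn.executemany(
    "INSERT INTO FavoriteColors VALUES (?, ?)",
    ((uid, color) for uid in range(1000) for color in ("red", "green", "blue")),
)

QUERY = "SELECT colorname FROM FavoriteColors WHERE userid = 123"


def plan(sql):
    """Return the planner's description; the last column of each row is the readable detail."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_without_index = plan(QUERY)  # expected to describe a scan of the whole table
conn.execute("CREATE INDEX idx_fc_userid ON FavoriteColors (userid)")
plan_with_index = plan(QUERY)     # expected to describe a search using the index

print(plan_without_index)
print(plan_with_index)
```

With the index in place the planner looks up only the rows for the requested user, so the work tracks the number of matching rows rather than the size of the table.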

                      Have you ever actually used a database? And not every program written presents you with a website viewable via the internet. There are many programs out there that just present data from a DB within a thin client to the user. It's not a good idea to always send all information you have to the client, because it slows down the client really fast.
