KDE's Nepomuk Doesn't Seem To Have A Future


  • kevinf28
    replied
    Originally posted by justinzane View Post
    Code:
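    # Print EXIF lines mentioning D200 for every JPEG; -exec passes each name intact, even with spaces
    find . -iregex '.*\.jpe?g' -exec sh -c 'exiftool "$1" | grep "D200"' sh {} \;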
    ...
    • bzgrep
    • deepgrep
    • egrep
    • fgrep
    • grep
    • lzegrep
    • lzfgrep
    • lzgrep
    • msggrep
    • orc-bugreport
    • pcregrep
    • pgrep
    • plugreport
    • wcgrep
    • xzegrep
    • xzfgrep
    • xzgrep
    • zegrep
    • zfgrep
    • zgrep
    • zipgrep


    ...
    Thanks for the code snippet and the useful greps.

    Here is the biggest problem with society: LAZY! Nepomuk shouldn't be strictly needed...

    "Say you received a photo from a friend of yours, 2 weeks ago. You saved it somewhere on your computer. Now how to you find that file? If you don't remember the location, you're out of luck." http://userbase.kde.org/Nepomuk

    It's not that hard to organize your damn files... /home/user/pictures/2014/trip_to_x /home/user/pictures/2013/downloads_from_net /home/user/pdfs/categoryXYZ /home/user/projects/xyz.

    There is no reason to NEED to search /home/* if you have ANY organizational skills, so indexers are not very useful.

    Email, on the other hand, sure: index it, cache it, add metadata, tag the crap out of it. Google Desktop Search is WAY better than Outlook for searching my Exchange email at work, for instance, and that is more useful than organizing emails into folders.


  • erendorn
    replied
    Originally posted by justinzane View Post
    For "proper" PDFs, those that are generated by document editing/creation apps from libreoffice.org Writer to Adobe FrameMaker, absolutely. Unfortunately, I have seen **way** too many "fake" PDFs that are nothing more than scanned images in a PDF wrapper. I've struggled to get unpaper/tesseract/gocr/ocropus/etc. to work effectively on documents that I have carefully scanned. It would make me most surprised, and quite thrilled, to find that Nepomuk was able to effectively OCR "bogus" PDFs.
    I don't think anybody suggested that Nepomuk was able or even asked to do any OCR.


  • erendorn
    replied
    Originally posted by justinzane View Post
    Oh, and as far as waiting for find/grep to do their thing... That is what multitasking is all about... multiple konsole tabs, multiple firefox tabs, kmail, akregator, kpat...
    Multitasking has nothing to do with it. Multitasking lets you do other things while you wait, but it does not reduce the wait in any way.

    Caching reduces the wait, just as Firefox maintains a large RAM and disk cache, KMail stores local copies of IMAP mail, and Akregator stores local copies of RSS data.

    For searching purposes, it is called indexing.
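
    The indexing idea can be sketched in a few lines of shell. This is only an illustration under simple assumptions (plain-text files, whole-word lookups), not how Nepomuk or Baloo work; the function names build_index and lookup and the tab-separated index format are made up for the example.

```shell
#!/bin/sh
# Hypothetical helpers: scan the files once to build a word -> file index,
# then answer queries from the index instead of rescanning every file.

# build_index DIR INDEX_FILE
# Writes one "word<TAB>path" line per distinct lowercase word per file.
# INDEX_FILE should live outside DIR so it is not indexed itself.
build_index() {
    dir=$1
    index=$2
    find "$dir" -type f | while IFS= read -r path; do
        # Split into words (one per line), lowercase them, drop duplicates.
        tr -cs '[:alnum:]' '\n' < "$path" | tr '[:upper:]' '[:lower:]' | sort -u |
            awk -v p="$path" 'NF { print $0 "\t" p }'
    done | sort > "$index"
}

# lookup INDEX_FILE WORD
# Prints the paths of files containing WORD, reading only the index file.
lookup() {
    word=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    awk -F '\t' -v w="$word" '$1 == w { print $2 }' "$1"
}
```

    Building the index still costs one full scan, but each later lookup reads a single index file instead of every document.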


  • justinzane
    replied
    OCR? Really?

    Originally posted by molecule-eye View Post
    Content searching, eg of pdfs,
    For "proper" PDFs, those that are generated by document editing/creation apps from libreoffice.org Writer to Adobe FrameMaker, absolutely. Unfortunately, I have seen **way** too many "fake" PDFs that are nothing more than scanned images in a PDF wrapper. I've struggled to get unpaper/tesseract/gocr/ocropus/etc. to work effectively on documents that I have carefully scanned. It would make me most surprised, and quite thrilled, to find that Nepomuk was able to effectively OCR "bogus" PDFs.


  • erendorn
    replied
    Originally posted by liam View Post
    I worded my responses carefully so as to avoid people inferring just what you did.
    In large metro areas trains are still heavily used (well, in the northeast at any rate) but pretty much everywhere else cars have just utterly supplanted trains as the primary people mover.
    Yes, exactly: supplanted in some places, for some uses (transporting people), and in terms of market share. That is far from disappearing.
    Hey, even in the US, freight transportation by rail is increasing in volume and market share right now.
    Hence, trains are not disappearing.
    I'm not even sure about "supplanted". Were people traveling as much before the car? I don't think so. The car has enlarged the transportation market by enabling new uses. In that case it is not supplanting, but independent growth in the same market.
    Do people take the car more than the train? Yes. Do people take the train less than they did before the car? That is much less clear.

    And the analogy seems to match pretty well what is happening to PCs (laptops and desktops).


  • justinzane
    replied
    Better than spending 6 Months tagging gigabytes of files...

    Originally posted by TheBlackCat View Post
    Yeah, I am sure most users would love to spend an hour waiting for grep to scan through gigabytes of files. And how, exactly, do you expect to use grep on a binary word processor document or image metadata? The point of modern search programs is that they use an index so you don't need to scan each file line-by-line one-at-a-time.
    Code:
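    # Print EXIF lines mentioning D200 for every JPEG; -exec passes each name intact, even with spaces
    find . -iregex '.*\.jpe?g' -exec sh -c 'exiftool "$1" | grep "D200"' sh {} \;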
    Of course, that is kind of a silly example. However, when you have maybe 100,000 images from several photographers, some from digital cameras like the Nikon D200 in the example, some from scanned prints, some from scanned slides/negatives, etc., programmatic searching using exiftool, dcraw, imagemagick, and bash/zsh is very useful, though not self-evident by any means. While finding a copyright line in EXIF data is simple, finding pictures of raccoons in trees by looking for images with an average color balance within a tolerance of a known sample image is far from exact.

    However, until I am able to go through all the images, delete the garbage, and tag the rest, there is no way for Nepomuk/Baloo/whatever to index anything that I cannot find in other ways. Add to that the inability to simply and effectively share tags or to do automated, iteratively corrected tagging of various image- and text-based files, and it seems to be quite limited.

    As far as "binary" documents, the various and sundry variants of grep make life easier:
    • bzgrep
    • deepgrep
    • egrep
    • fgrep
    • grep
    • lzegrep
    • lzfgrep
    • lzgrep
    • msggrep
    • orc-bugreport
    • pcregrep
    • pgrep
    • plugreport
    • wcgrep
    • xzegrep
    • xzfgrep
    • xzgrep
    • zegrep
    • zfgrep
    • zgrep
    • zipgrep

    since most binary formats are simply zip/gzip compressed text (dtf, xml, json, yaml, ini, csv, tsv, etc.) based files.
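
    For gzip-compressed text, the z* variants amount to decompressing and then grepping. A minimal sketch of that idea, assuming gzip and grep are installed; search_gz is a hypothetical helper name, not a standard tool:

```shell
#!/bin/sh
# Hypothetical helper: search_gz PATTERN FILE...
# Prints the name of every gzip-compressed FILE whose decompressed
# contents match PATTERN (roughly what zgrep does for .gz files).
search_gz() {
    pattern=$1
    shift
    for f in "$@"; do
        # gzip -dc writes the decompressed data to stdout;
        # grep -q prints nothing and only reports match/no match.
        if gzip -dc "$f" 2>/dev/null | grep -q -e "$pattern"; then
            printf '%s\n' "$f"
        fi
    done
}
```

    Run against the raw .gz files, plain grep would only see compressed bytes and would most likely miss the text. Zip-based containers need the same treatment with a zip-aware tool such as zipgrep.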

    Further, there are almost countless other uncommon but significant file formats that Nepomuk has no knowledge of and cannot index. For those into cartography, grep can tell me which ESRI shapefile(s) might have info about Grenada, CA. nepomuksearch cannot.

    Code:
    [Wed 14/02/19 08:51 UTC][pts/3][x86_64/linux-gnu/3.12.9-2-ARCH][5.0.5]
    <[email protected]:~/downloads/osm_CA>
    zsh/2 1160 % grep "Grenada" *
    Binary file places.dbf matches
    Binary file points.dbf matches
    Binary file roads.dbf matches
    [Wed 14/02/19 08:52 UTC][pts/3][x86_64/linux-gnu/3.12.9-2-ARCH][5.0.5]
    <[email protected]:~/downloads/osm_CA>
    zsh/2 1161 % nepomuksearch "Grenada"
    (Note final blank line.)
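
    The "Binary file ... matches" lines above are what grep prints by default for non-text files. A small sketch for getting just the file names, assuming a grep that supports -r (GNU, BSD, and BusyBox grep do); files_mentioning is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: files_mentioning PATTERN DIR
# Lists the files under DIR whose raw bytes contain PATTERN.
#   -r  recurse into DIR
#   -l  print only the names of matching files
#   -F  treat PATTERN as a fixed string, not a regular expression
files_mentioning() {
    grep -rlF -e "$1" "$2" | sort
}
```

    This only finds strings stored uncompressed in the file, as they happen to be in the .dbf attribute tables here.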

    Though it might be possible to use data gleaned from files to generate "fingerprints" and search a global dataset to determine likely tags, this has obvious privacy implications. All told, while the functionality provided by Nepomuk/Baloo is probably quite useful for many users, it seems to be something that is optional to a basic desktop environment. The KDESDK apps are not core; neither should search be.

    Oh, and as far as waiting for find/grep to do their thing... That is what multitasking is all about... multiple konsole tabs, multiple firefox tabs, kmail, akregator, kpat...
    Last edited by justinzane; 19 February 2014, 05:05 AM. Reason: clarification


  • molecule-eye
    replied
    I use Nepomuk and find the ability to add comments, tags, and ratings to arbitrary files, and to search them quickly and "easily", quite useful, especially since you can move the files around without breaking any of the associated metadata.

    Two problems I see with nepomuk. The first is that Dolphin doesn't provide an easy way to search via tags or comments or ratings. You have to figure out some baroque code query to punch into the address bar to use this feature. The ability to save "smart searches" would be awesome (or is this available and I don't know it?)

    The second is that more software needs to make good use of nepomuk. A plugin for Amarok was recently developed but it's far from ideal and I wouldn't use it. Bangarang is no longer being developed it appears, and what else is there? Gwenview is the only core piece of KDE software making decent use of Nepomuk.

    Content searching, eg of pdfs, using nepomuk is about a million times less useful than using Recoll, which I'm using and love. If there were a better "advanced search" mode in Dolphin it might be useful, but just getting a pile of results totally unfiltered for relevance or anything is pretty useless.

    The idea behind the semantic desktop is brilliant and I hope something like nepomuk stays around in kde for the indefinite future.


  • liam
    replied
    Originally posted by anda_skoa View Post
    Or for medium distances when flight "overhead" (security, boarding, disembarkation, baggage claim) is a significant part of the travel time.
    E.g. a train connecting two major cities at a distance of 200km can do the trip in an hour or even less. A flight would not be faster but considerably less convenient.

    Cheers,
    _
    When I was in Europe with my family I recall flights actually being cheaper than trains (between Italy and Belgium, for instance). I've been really disappointed in how expensive trains are in general. I'm still a fan of high-speed rail between major cities because it is so efficient.


  • anda_skoa
    replied
    Originally posted by liam View Post
    Again, that's why I said in certain places. Trains make perfect sense in small, very densely populated areas (like the northeastern US) but everywhere else cars/aircraft are the answer.
    Or for medium distances when flight "overhead" (security, boarding, disembarkation, baggage claim) is a significant part of the travel time.
    E.g. a train connecting two major cities at a distance of 200km can do the trip in an hour or even less. A flight would not be faster but considerably less convenient.

    Cheers,
    _


  • liam
    replied
    Originally posted by TheBlackCat View Post
    In the U.S. Not so much in Europe, where the train system is much more developed.
    Again, that's why I said in certain places. Trains make perfect sense in small, very densely populated areas (like the northeastern US) but everywhere else cars/aircraft are the answer.
