Why My Team Went With DynamoDB Over MongoDB

timothy posted about 2 years ago | from the mostly-for-the-web-scale-of-it dept.

Software 106

Nerval's Lobster writes "Software developer Jeff Cogswell, who matched up Java and C# and peeked under the hood of Facebook's Graph Search, is back with a new tale: why his team decided to go with Amazon's DynamoDB over MongoDB when it came to building a highly customized content system, even though his team specialized in MongoDB. While DynamoDB did offer certain advantages, it also came with some significant headaches, including issues with embedded data structures and Amazon's sometimes-confusing billing structure. He offers a walkthrough of his team's tips and tricks, with some helpful advice on avoiding pitfalls for anyone interested in considering DynamoDB. 'Although I'm not thrilled about the additional work we had to do (at times it felt like going back two decades in technology by writing indexes ourselves),' he writes, 'we did end up with some nice reusable code to help us with the serialization and indexes and such, which will make future projects easier.'"


That's different... (2, Funny)

Anonymous Coward | about 2 years ago | (#42970551)

They must run their company pretty different than where I work.

Where I work, the most senior and backstabby developer saddles the worst tools he can find on the rest of the team, and then blames them (behind their backs of course) for the results of his poor decision making.

I don't understand (3, Funny)

Anonymous Coward | about 2 years ago | (#42970553)

But MongoDB is web scale.

Re:I don't understand (4, Funny)

OakDragon (885217) | about 2 years ago | (#42970835)

MongoDB ... just a pawn in the game of life.

Re:I don't understand (-1)

Anonymous Coward | about 2 years ago | (#42971985)

MongoDB ... just a pawn in the game of life.

Now that is funny!!

Re:I don't understand (1)

tralfaz2001 (652552) | about 2 years ago | (#42974573)

Please for the love of god tell me I'm not the only one that got this Blazing Saddles reference. Well done sir.

Re:I don't understand (1, Funny)

K. S. Kyosuke (729550) | about 2 years ago | (#42970843)

Haven't you seen the newest succ (succ (succ (succ (succ (succ (succ Zero)))))) movie, "The Web is not enough"?

Re:I don't understand (0)

Anonymous Coward | about 2 years ago | (#42970965)

Spanish balloons? Mongo take chance...!

No one cares (5, Insightful)

Anonymous Coward | about 2 years ago | (#42970597)

No one cares. Stop click-baiting the buzzword Slashdot sub-sites. If we wanted to go to them we would do so voluntarily.

Re:No one cares (1, Funny)

Anonymous Coward | about 2 years ago | (#42971153)

But I want Dice to tell me all the ways in which backend specialists are critical to online games!

Hoopty Doo (-1, Troll)

Anonymous Coward | about 2 years ago | (#42970677)

Why are we reposting this faggots blog on /.?

Re:Hoopty Doo (-1)

Anonymous Coward | about 2 years ago | (#42970985)

Non-contributing slashdot fuck is incensed that someone else got attention. News at eleven.

News for Trolling Nerds (-1)

Anonymous Coward | about 2 years ago | (#42971311)

It's the new Slashdot business plan, rustle up interest by irritating the few remaining techies.

MongoDB with ObjectRocket FTW (0)

Anonymous Coward | about 2 years ago | (#42970787)

ObjectRocket has a pretty awesome solution and the dudes there know their sh!t: http://www.objectrocket.com/

Re:MongoDB with ObjectRocket FTW (0)

Anonymous Coward | about 2 years ago | (#42971021)

This post is sponsored by /dev/null Enterprises.

devs and DB indexes (1)

alen (225700) | about 2 years ago | (#42970811)

there are two kinds

the first creates a 10,000,000 row table with no indexes, no PK and then complains that the DBAs are dumb because the app is slow or the server is broke

the second kind i've seen have a 100 row table, with 10 columns and 15 indexes on it. sometimes half my day is spent on deleting unused indexes created by our BI devs

Re:devs and DB indexes (1)

Anonymous Coward | about 2 years ago | (#42971045)

Wait, so half of your day is done by a cron job? Really? So you just go hide in a closet or take lunch while the job does these deletes? Are you French?

Re:devs and DB indexes (1)

larry bagina (561269) | about 2 years ago | (#42971543)

Huh? What does being bisexual have to do with it?

Re:devs and DB indexes (0)

Anonymous Coward | about 2 years ago | (#42971937)

If that's really the case, encourage them to use ORM. A decent one will do a better job than this.

Re:devs and DB indexes (1)

CadentOrange (2429626) | about 2 years ago | (#42972025)

An ORM isn't a silver bullet. You still need to understand how your objects map onto the database or you're back to square one with a poorly performing database. In fact it's probably worse as you then need to figure out what the database *and* ORM are doing.

Re:devs and DB indexes (0)

Anonymous Coward | about 2 years ago | (#42973315)

If your objects map to the database with any significant regularity then you're doing both layers wrong.

ORM is a disease. Step 1 to being cured is admitting you have a problem.

Fools (0)

Anonymous Coward | about 2 years ago | (#42970827)

Fools! Everyone knows DynamoDB isn't Web Scale. [youtube.com]

MONGO SAY, THAT BE A DYN-O-MITE !! (-1)

Anonymous Coward | about 2 years ago | (#42970875)

Mongo Jurry says let Arias free !! Let her chop suey her BFs if she wants !! Let her be FREE to do as she pleases !! She has had a hard life !!

Worried about hosting data alongside others... (3, Insightful)

Anonymous Coward | about 2 years ago | (#42970893)

"Our client is paying less than $100 per month for the data. Yes, there are MongoDB hosting options for less than this; but as I mentioned earlier, those tend to be shared options where your data is hosted alongside other data."

I think someone failed to explain how "the cloud" actually works.

It's so ... wrong (5, Insightful)

Anonymous Coward | about 2 years ago | (#42971023)

Having actually RTFA, it just reinforces how poorly most programmers understand relational databases and shouldn't be let near them. It's so consistently wrong it could be just straight trolling (which given it's posted to post-Taco Slashdot, is likely).

"However, the articles also contained data less suited to a traditional database. For example, each article could have multiple authors, so there were actually more authors than there were articles."

This is completely wrong, that's a textbook case of something perfectly suited to a traditional (relational) database.

Re:It's so ... wrong (1)

Anonymous Coward | about 2 years ago | (#42971099)

NoSQL is a buzzword meaning "too dumb to understand a RDB". That's why they poorly reinvent the wheel.

Re:It's so ... wrong (5, Funny)

MightyMartian (840721) | about 2 years ago | (#42971103)

"Those who don't understand SQL are condemned to reinvent it, poorly." (with apologies to Henry Spencer).

Re:It's so ... wrong (5, Insightful)

Torvac (691504) | about 2 years ago | (#42971799)

"with big data comes big responsibility". i mean a few very static 100k items require a NoSQL DB solution and cloud storage ? and a full team to do this ?

Re:It's so ... wrong (4, Insightful)

Tom (822) | about 2 years ago | (#42971915)

Mod parent up.

After a few years in other fields, I'm doing some serious coding again. Postgres and Doctrine. I can do in a few lines of code and SQL what would take a small program or module to do without the power of SQL and an ORM.

Anyone who reinvents that wheel because he thinks he can do the 2% he recoded better is a moron.

Re:It's so ... wrong (-1)

Anonymous Coward | about 2 years ago | (#42973353)

You had me up until the ORM part. In my experience people who use an ORM are doing *both* layers horribly wrong.

Other than that, I agree with you.

Re:It's so ... wrong (1, Insightful)

PRMan (959735) | about 2 years ago | (#42974023)

Yeah. Going without ORM you typically get a minimum of 50% better.

Re:It's so ... wrong (1)

UnknownSoldier (67820) | about 2 years ago | (#42973329)

I know you jest but sometimes you DO want to re-write SQL. i.e. row store vs column store.

NewSQL vs. NoSQL for New OLTP
http://www.youtube.com/watch?v=uhDM4fcI2aI [youtube.com]

One Size Does Not Fit All in DB Systems
http://www.youtube.com/watch?v=QQdbTpvjITM [youtube.com]

Re:It's so ... wrong (3, Insightful)

MightyMartian (840721) | about 2 years ago | (#42973607)

I jest slightly. Certainly there are applications where SQL and relational systems in general are overkill, or where they do not solve certain kinds of problems well. But I'll be frank, they're pretty rare. I will use binary search/sort mechanisms for simple hashes and other similar two column key-value problems, mainly because there's absolutely no need to truck along gazillions of bytes worth of RDBMS where quicksort and a binary search is all that is needed. But if you get beyond that, you're almost inevitably going to start wishing you had JOIN. And then you end up having to implement such functionality.

Every tool for the job, to be sure, but I just happen to think there are far fewer problems that nosql style systems solve than some like to think.
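For the simple two-column case described above, a rough sketch of "sort once, then binary search" might look like this in Python. The function names and sample data are invented for illustration, and Python's built-in sort is Timsort rather than quicksort, so this shows the idea rather than the exact mechanism mentioned.

```python
import bisect

def build_table(pairs):
    """Sort (key, value) pairs by key and split them into parallel lists."""
    ordered = sorted(pairs, key=lambda p: p[0])  # Timsort, not quicksort
    keys = [k for k, _ in ordered]
    values = [v for _, v in ordered]
    return keys, values

def lookup(keys, values, key, default=None):
    """Binary-search the sorted keys; return the matching value or default."""
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return values[i]
    return default

keys, values = build_table([("b", 2), ("a", 1), ("c", 3)])
print(lookup(keys, values, "b"))
```

This covers lookups by key and nothing else. The moment two such tables need to be combined, the join has to be written by hand.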

Re:It's so ... wrong (1)

Nbrevu (2848029) | about 2 years ago | (#42977249)

Every tool for the job, to be sure, but I just happen to think there are far fewer problems that nosql style systems solve than some like to think.

I strongly agree with this, and because of that I've been severely chastised by quite a few kool-aid drinkers. On my current job we have a NoSQL database (a MongoDB one, actually) and we indeed have had to reinvent some SQL here and there, including a few manual joins. The job would just have been far smoother (and faster to develop), and surely more performant, if we used a well-established SQL database, but someone decided that it wasn't buzzwordy enough.

Re:It's so ... wrong (5, Funny)

vux984 (928602) | about 2 years ago | (#42971465)

"However, the articles also contained data less suited to a traditional database. For example, each article could have multiple authors, so there were actually more authors than there were articles."

Good god, how would he model invoices with multiple line items? Where, you know, there were actually more line items than invoices?! Mind blown.

Or customers that might belong to zero or more demographics? There could be more customers than defined demographics to tag them with... or less... we don't even know and it could change as more of either are added!!

We need a whole new database paradigm!

Or the sample Northwind database that's been shipping with Access since the '90s.

Re:It's so ... wrong (1)

Fnord666 (889225) | about 2 years ago | (#42974679)

We need a whole new database paradigm!

Wait, don't you just draw a different arrow on the end of the line joining the two tables and the rest happens automatically?

Re:It's so ... wrong (2)

hey (83763) | about 2 years ago | (#42971879)

Make a table of authors, make a linking table that joins authors to the article table.
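A minimal sketch of that layout, using Python's built-in sqlite3 module. The table names, column names, and sample rows are made up for illustration and do not come from TFA.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE author  (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    -- Linking table: one row per (article, author) pair.
    CREATE TABLE article_author (
        article_id INTEGER NOT NULL REFERENCES article(id),
        author_id  INTEGER NOT NULL REFERENCES author(id),
        PRIMARY KEY (article_id, author_id)
    );
""")
conn.executemany("INSERT INTO article VALUES (?, ?)",
                 [(1, "Paper A"), (2, "Paper B")])
conn.executemany("INSERT INTO author VALUES (?, ?)",
                 [(1, "Alice"), (2, "Bob"), (3, "Carol")])
conn.executemany("INSERT INTO article_author VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 2), (2, 3)])

def titles_by_author(name):
    """Return the titles of every article the named author contributed to."""
    rows = conn.execute("""
        SELECT a.title
        FROM article a
        JOIN article_author aa ON aa.article_id = a.id
        JOIN author au ON au.id = aa.author_id
        WHERE au.name = ?
        ORDER BY a.title
    """, (name,))
    return [row[0] for row in rows]

print(titles_by_author("Bob"))
```

Having more authors than articles (three against two here) changes nothing about the schema.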

Re:It's so ... wrong (2)

serviscope_minor (664417) | about 2 years ago | (#42972443)

This is completely wrong

No, it's completely right: the traditional way to use a database is to blob everything together into one huge table, preferably with many NULLs, then limit your query to SELECT * FROM Table; and finally process the results directly in VB6, with bonus points for a buggy parser for unpicking comma separated fields.

Note: he said "traditional" not "sane relational".

Sarcasm aside, his reason for not using a relational database is that he'd need to use more than one table and then he'd have to perform joins on them, which sounds very much like saying the reason not to use SQL is because the problem fits exactly into what SQL is designed to do.

But hey, his new solution is in the cloud so it must be better.

Re:It's so ... wrong (1)

C10H14N2 (640033) | about 2 years ago | (#42973107)

No, no, no, you let your tedious "DBAs" think they're right and do all that "normalization" and "tuning" shit they keep yammering on about (whatevs), then get the new shiny [microsoft.com] so you can blob the whole fucker up and never have to worry about anything but said "SELECT * FROM FOO." It's great because our developers no longer have to talk to our DBAs about "optimizing" all that dynamic SQL our webforms were generating. The DBAs are now screaming about resource utilization, but, HELLO, they're the ones who insisted on building all those freakin tables in the first place when everyone knows you just need one to throw in all the XML. Idiots.

Re:It's so ... wrong (0)

Anonymous Coward | about 2 years ago | (#42972449)

His C# vs. Java article for the "real world" isn't any better. His whole argument basically starts at C# vs. Java where it seems like C# is doing better and quickly switches the topic to Tomcat vs. IIS for web development while maintaining that this is purely about C# vs Java, and declares Java the obvious winner.

Re:It's so ... wrong (1)

MurukeshM (1901690) | about 2 years ago | (#42976803)

Oh, that moron? 10 minutes wasted checking the comments to see if TFA is worth reading..

Re:It's so ... wrong (1)

MatthiasF (1853064) | about 2 years ago | (#42973135)

For normalized databases, this is often considered a best practice, although another option would be to store multiple author IDs in the article tables—something that would require extra fields, since most articles had more than one author. That would also require that we anticipate the maximum of author fields needed, which could lead to problems down the road.

A single field with delimited index keys pointing to an author table. I learned that in 1996. Then compressing the field with a dictionary, increasing the number of keys that can fit and speeding up searches through it. Learned that in 1998.

Why does that not work in NoSQL? I don't understand.

Re:It's so ... wrong (2)

tgd (2822) | about 2 years ago | (#42973819)

Having actually RTFA, it just reinforces how poorly most programmers understand relational databases and shouldn't be let near them. It's so consistently wrong it could be just straight trolling (which given it's posted to post-Taco Slashdot, is likely).

"However, the articles also contained data less suited to a traditional database. For example, each article could have multiple authors, so there were actually more authors than there were articles."

This is completely wrong, that's a textbook case of something perfectly suited to a traditional (relational) database.

Well, based on how many things are wrong in the Java vs C# comparison, too, one can only guess that the "software developer" is just some hack who is comped by Slashdot to drive clicks to their sub-sites.

Man this place has really gone to shit in the last year -- just a waste of time to read. Sucks it's hard to break 15 years of habit ...

Re:It's so ... wrong (0)

Anonymous Coward | about 2 years ago | (#42974395)

I couldn't help but laugh and sigh at both the article and the submission. Kids these days...

I prefer MongoDB because (-1)

Anonymous Coward | about 2 years ago | (#42971037)

Jag är mongo

Du är också mongo

Vi bor i ett cykelförråd i Korpilombolo

Rånar banker

Slår ner folk med plankor

Sprider skräck i stan

Följ med oss och bli vandal

Om du säger till nå'n att vi brukar pressa folk på stålar

Då piskar vi dig med en flugsmälla så att du vrålar

Vi är maffia med läderkepps

Vi använder alltid våra biceps

Vi skjuter med kulsprutorna

så du måste dansa

naken med läderkepps med en transa

Men om du är mongo då följer du med oss

Joina våran maffia sätt i gång och slåss

Röka maja med oss

I cykelförrådet förståss
 

You ins3nsitive clod! (-1)

Anonymous Coward | about 2 years ago | (#42971041)

stand anymo8e, bben many, not the

Am I the only one (0, Offtopic)

spatley (191233) | about 2 years ago | (#42971143)

that is getting sick of this content-free, slashdot echo chamber, clickcrack stuff. Hey Slashdot, why do you need whole nuther site to post original articles? And why do those articles make such a deafening sucking sound?
Problem is that I would be interested in a reasoned look at MongoDB v Dynamo but my experience with http://slashdot.org/topic/bi/ [slashdot.org] is not to waste my time by reading TFA.

So the gist of the article is..... (4, Informative)

f-bomb (101901) | about 2 years ago | (#42971169)

MongoDB would have been perfect based on the structure of the data, but the client didn't want to pay for setup and hosting costs. DynamoDB was the cheaper alternative, but more of a pain in the ass to implement. Makes me wonder if the hosting cost savings offset the additional development time.

Question from relational-land (4, Informative)

mcmonkey (96054) | about 2 years ago | (#42971183)

As someone whose work and thinking are firmly planted in traditional RDBMS, a few of those decisions did not make sense.

I understand what he's saying about normalized tables for author, keywords, and categories. But then when he has to build and maintain index tables for author, keyword, and categories, doesn't that negate any advantage of not having those tables?

I understand he's designed things to ease retrieval of articles, but it seems the trade-offs on other functions are too great. It's nice an author's bio is right there in the article object, but when it's time to update the bio, doesn't that mean going through and touching every article by that author?

I've got a bunch of similar examples, and I would not be at all surprised if they all boiled down to 'I don't understand what this guy is doing.' But basically, isn't NoSQL's strength in dealing with dynamic content? And in this example, serving static articles, isn't the choice between NoSQL and traditional RDBMS essentially up to personal preference?
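To make the index-maintenance question concrete, here is a rough sketch of what a hand-built author index costs in a plain key/value store. Two Python dicts stand in for the article table and the index table. All names are hypothetical, and this is not the code from TFA.

```python
articles = {}      # article_id -> record (stands in for the main table)
author_index = {}  # author name -> set of article_ids (the hand-built index)

def put_article(article_id, title, authors):
    """Write an article and keep the author index in step with it."""
    old = articles.get(article_id)
    if old is not None:
        # Remove stale index entries before rewriting the record.
        for name in set(old["authors"]):
            ids = author_index.get(name, set())
            ids.discard(article_id)
            if not ids:
                author_index.pop(name, None)
    articles[article_id] = {"title": title, "authors": list(authors)}
    for name in authors:
        author_index.setdefault(name, set()).add(article_id)

def articles_by(name):
    """Look up article ids for an author through the hand-built index."""
    return sorted(author_index.get(name, set()))

put_article(1, "First", ["Alice", "Bob"])
put_article(2, "Second", ["Bob"])
put_article(1, "First", ["Alice"])  # Bob dropped from article 1
print(articles_by("Bob"))
```

Every write path has to remember to touch both structures, which is the bookkeeping a relational engine does on its own when you declare an index.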

Re:Question from relational-land (1)

Anonymous Coward | about 2 years ago | (#42971527)

Maybe you should factor in the usage pattern and instance counts as well.

Someone's bio might appear in how many articles? A few hundred? And how often will the bio be updated? A couple of times a year? So, updating a bio comes down to touching a few hundred records a few times a year. Compare that with thousands of accesses per day and you've suddenly tipped the scale.

Re:Question from relational-land (1)

Anonymous Coward | about 2 years ago | (#42971607)

So... what you're saying is that the application needs a materialized view after benchmarks show that joining against the authors table is a performance bottleneck?

Re:Question from relational-land (4, Insightful)

ranton (36917) | about 2 years ago | (#42971695)

Oh come on now. Play fair. If you start throwing around advanced database features like materialized views then you will immediately invalidate 90% of the use cases commonly used for choosing NoSQL over relational databases. That is just mean.

Re:Question from relational-land (1)

godefroi (52421) | about 2 years ago | (#42980511)

Oracle's "snapshots" were renamed to "materialized views" in 1999, MSSQL gained "indexed views" in 2005, MongoDB "began development" in 2007.

Doomed to reinvent it, indeed.

Re:Question from relational-land (3, Informative)

mcmonkey (96054) | about 2 years ago | (#42971787)

Maybe you should factor in the usage pattern and instance counts as well.

Someone's bio might appear in how many articles? A few hundred? And how often will the bio be updated? A couple of times a year? So, updating a bio comes down to touching a few hundred records a few times a year. Compare that with thousands of accesses per day and you've suddenly tipped the scale.

That's exactly the sort of answer I was looking for. Thank you. (Actually, I'd expect most bios get updated only a handful of times over the life of the author. You start with first publications as a grad student, then you leave school, maybe change jobs a couple of times, maybe a few notable achievements, then the author dies.)

Those are the sort of design considerations I'd like to read about. That would give a useful comparison between platforms. As it is, this article boils down to "I went NoSQL over RDBMS, because...well, just because. I went Amazon over something else because it's easier for my idiot client to administer."

Re:Question from relational-land (1)

ScriptedReplay (908196) | about 2 years ago | (#42976995)

Someone's bio might appear in how many articles? A few hundred? And how often will the bio be updated? A couple of times a year? So, updating a bio comes down to touching a few hundred records a few times a year. Compare that with thousands of accesses per day and you've suddenly tipped the scale.

That would make sense if you had to pull bios with an article, which should hardly be the case. At most, you'd have to pull in current authors' affiliations. A bio would ideally stay behind an author link, and be pulled in quite rarely. I for one would much rather have a list of authors immediately followed by the abstract than having to move through several pages of biographies for an article with 4-5 authors in order to find the abstract and the actual article. So for me the decision to put every bio in every article looked like a poorly researched one. YMMV and all that.

Re:Question from relational-land (5, Insightful)

ranton (36917) | about 2 years ago | (#42971603)

Don't try to actually make sense of the decisions made in the article. I am glad that he summed up all of the reasons why he didn't go with a relational database early in the article, so I didn't have to bother reading the rest. I am an advocate of NoSQL, but this whole article is describing a project that is almost perfect for a relational database.

But considering this author's previous analysis of Java vs C#, I am not surprised that this article was hardly worth the time to read.

Re:Question from relational-land (1)

adnonsense (826530) | about 2 years ago | (#42979725)

Don't try to actually make sense of the decisions made in the article. I am glad that he summed up all of the reasons why he didn't go with a relational database early in the article, so I didn't have to bother reading the rest. I am an advocate of NoSQL, but this whole article is describing a project that is almost perfect for a relational database.

Heck yeah, it reminds me of a project I did in 2004 or 2005, which stored over a hundred thousand articles (some of them more than 64Kb!) with multiple authors, keywords and other fancy schmancy stuff. I've no idea what "a good amount of traffic from a niche group of scientists and researchers" means in real terms, but the system I put together was getting something like 40,000 unique visitors a day, running off some not particularly spectacular hardware (this was a time when 1GB was a lot of memory). As there was no NoSQL back then, I had to "make do" with a proper relational database (PostgreSQL), which wasn't exactly a speed demon at the time, but very kindly took care of things like indexes and keeping things in sync (aka "relational integrity") leaving me free to concentrate on optimizing the whole stack. Oh yes, it was only me on the "team". And I managed to bodge a Lucene-based search system into the setup (as PostgreSQL's full-text search was a bit sucky).

I suppose what with it being 2013 and such, it would be possible to push it into the cloud and squeeze in some JSONy bits as well if necessary.

Kids of today, eh...

Re:Question from relational-land (0)

Anonymous Coward | about 2 years ago | (#42972373)

(I'm not a professional developer, just an observer from the sidelines.)

So far the only sound use cases I've heard for NoSQL are things like Amazon and Google where:

1. They have very large data sets.
2. They have management that can make an educated business decision about what kind of guarantees they do and don't need to make to their customers.
3. They have people inhouse who have a strong understanding of the underlying CS trade offs in the design of database systems who can see how to maximize performance while still maintaining the right guarantees.

Outside of that, every pitch for NoSQL I've heard sounds like people are getting in way over their heads and won't realize it until way too late.

Re:Question from relational-land (1)

frank_adrian314159 (469671) | about 2 years ago | (#42972375)

It's nice an author's bio is right there in the article object, but when it's time to update the bio, doesn't that mean going through and touching every article by that author?

Actually, you don't update the biographical information for an article. The biographical information in the article is supposed to reflect the biographical information for the author at the time at which the article is published. When you update the biographical information, it goes into any articles published after the bio is updated. Unless, of course, you want to have a completely different paradigm of publishing than that established in the days of hard copy (which may be a good thing, but is not what is done now). In fact, previous employers for the author may get quite irate that research funded and published by their institution no longer mentions the same because the author has moved on.

No, it's not as simple as it looks. Thanks for asking...

Re:Question from relational-land (1)

MurukeshM (1901690) | about 2 years ago | (#42976829)

Ars Technica follows the non-traditional way, and personally, only nostalgia would be a reason to retain the original bio.

Re:Question from relational-land (0)

Anonymous Coward | about 2 years ago | (#42972807)

I have no experience in NoSQL, but it is just a glorified key/value store. You could just use a reference to that user's bio instead of embedding it. The key could be something like the UserID+"Bio". I am not sure of the preferred method to create an object reference that doesn't change based on the value, but I'm sure this would work.
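A small sketch of that reference idea, with a Python dict standing in for whatever key/value store is in use. The key scheme (UserID followed by ":Bio") and the function names are assumptions for illustration only.

```python
store = {}  # stands in for any key/value store

def bio_key(user_id):
    """Derive the bio's key from the user id, so it never depends on the bio text."""
    return f"{user_id}:Bio"

def put_bio(user_id, text):
    store[bio_key(user_id)] = text

def put_article(article_id, title, author_ids):
    # The article holds author ids only; bios live under their own keys.
    store[f"article:{article_id}"] = {"title": title,
                                      "author_ids": list(author_ids)}

def get_article_with_bios(article_id):
    """Fetch the article, then one extra lookup per author for the bio."""
    article = store[f"article:{article_id}"]
    bios = {uid: store.get(bio_key(uid), "") for uid in article["author_ids"]}
    return {"title": article["title"], "bios": bios}

put_bio("u1", "Grad student")
put_article(7, "Some Paper", ["u1"])
put_bio("u1", "Professor")  # one write, no articles touched
print(get_article_with_bios(7)["bios"]["u1"])
```

The trade is one extra read per author on every article fetch in exchange for a single write when a bio changes.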

Bad planning (5, Interesting)

Samantha Wright (1324923) | about 2 years ago | (#42971217)

Throughout the article the client says they don't want full-text search. The author says he can "add it later," then compresses the body text field. Metadata like authorship information is also stored in a nasty JSON format—so say goodbye to being able to search that later, too!

About that compression...

That compression proved to be important due to yet another shortcoming of DynamoDB, one that nearly made me pull my hair out and encourage the team to switch back to MongoDB. It turns out the maximum record size in DynamoDB is 64K. That’s not much, and it takes me back to the days of 16-bit Windows where the text field GUI element could only hold a maximum of 64K. That was also, um, twenty years ago.

Which is a limit that, say, InnoDB in MySQL also has. So, let's tally it up:

  • There's no way at all to search article text.
  • Comma-separated lists must be parsed to query by author name.
  • The same applies to keywords...
  • And categories...

So what the hell is this database for? It's unusable, unsearchable, and completely pointless. You have to know the title of the article you're interested in to query it! It sounds, honestly, like this is a case where the client didn't know what they needed. I really, really am hard-pressed to fathom a repository for scientific articles where they store the full text but only need to look up titles. With that kind of design, they could drop their internal DB and just use PubMed or Google Scholar... and get way better results!

I think the author and his team failed the customer in this case by providing them with an inflexible system. Either they forced the client into accepting these horrible limitations so they could play with new (and expensive!) toys, or the client just flat-out doesn't need this database for anything (in which case it's a waste of money.) This kind of data absolutely needs to be kept in a relational database to be useful.

Which, along with his horrible Java vs. C# comparison [slashdot.org] , makes Jeff Cogswell officially the Slashdot contributor with the worst analytical skills.

Re:Bad planning (3, Interesting)

mcmonkey (96054) | about 2 years ago | (#42971445)

Which, along with his horrible Java vs. C# comparison [slashdot.org] , makes Jeff Cogswell officially the Slashdot contributor with the worst analytical skills.

OK, that's what I thought. Well, first, for anyone who hasn't read or doesn't remember that "Java vs. C#" thing, don't go back and read it now. Save your time, it's horrible.

Now, for the current article, isn't designing a database all about trade-offs? E.g. Indexes make it easier to find stuff, but then make extra work (updating indexes) when adding stuff. It's about balancing reading and writing, speed and maintenance, etc. And it seems like this guy has only thought about pulling out a single article to the exclusion of everything else.

Do we just not understand DynamoDB? How does this system pull all the articles by a certain author or with a certain keyword? What if they need to update an author's bio? With categories stored within the article object, how does he enforce integrity, so all "general relativity" articles end up with "general relativity" and not a mix of GR, Gen Rel, g relativity, etc?
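For comparison, this is roughly how the category-integrity question is answered on the relational side: a foreign key rejects any spelling that is not in the category table. The sketch uses Python's sqlite3 module with made-up table names. With categories embedded in each article object, the equivalent check has to live in application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE category (name TEXT PRIMARY KEY);
    CREATE TABLE article (
        id       INTEGER PRIMARY KEY,
        title    TEXT NOT NULL,
        category TEXT NOT NULL REFERENCES category(name)
    );
    INSERT INTO category VALUES ('general relativity');
""")

def add_article(title, category):
    """Insert an article; return False if the category is not a known one."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO article (title, category) VALUES (?, ?)",
                (title, category),
            )
        return True
    except sqlite3.IntegrityError:
        return False

print(add_article("Frame dragging", "general relativity"))
print(add_article("Geodesics", "Gen Rel"))
```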

What happens when they want to add full text search? Or pictures to articles? That 64k limit would seem like a deal breaker. 64k that includes EVERYTHING about an article--abstract, full text, authors and bios, etc.

My first thought was, this does not make much sense. Then I thought, well, I work with old skool RDBMS, and I just don't get NoSQL. But now I think, naw, this guy really doesn't know enough to merit the level of attention his blatherings get on /.

Re:Bad planning (4, Interesting)

hawguy (1600213) | about 2 years ago | (#42971685)

That compression proved to be important due to yet another shortcoming of DynamoDB, one that nearly made me pull my hair out and encourage the team to switch back to MongoDB. It turns out the maximum record size in DynamoDB is 64K. That’s not much, and it takes me back to the days of 16-bit Windows where the text field GUI element could only hold a maximum of 64K. That was also, um, twenty years ago.

I didn't understand why he dismissed S3 to store his documents in the first place:

Amazon has their S3 storage, but that’s more suited to blob data—not ideal for documents

Why wouldn't an S3 blob be an ideal place to store a document of unknown size that you don't care about indexing? Later he says "In the DynamoDB record, simply store the identifier for the S3 object. That doesn’t sound like much fun, but it would be doable" -- is storing an S3 pointer worse than deploying a solution that will fail on the first document that exceeds 64KB, at which point he'll need to come up with a scheme to split large docs across multiple records? Especially when DynamoDB storage costs 10 times more than S3 storage ($1/GB/month vs $0.095/GB/month)
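A rough sketch of the pointer approach, with two Python dicts standing in for the S3 bucket and the DynamoDB table so it runs without any AWS account. The key scheme and function names are hypothetical. The 64K figure is the record limit quoted from TFA.

```python
import json

MAX_RECORD_BYTES = 64 * 1024  # record size limit quoted in TFA

blob_store = {}  # stand-in for an S3 bucket: object key -> bytes
records = {}     # stand-in for the DynamoDB table: doc_id -> small record

def save_document(doc_id, title, body):
    """Put the body in the blob store and keep only a pointer in the record."""
    object_key = f"docs/{doc_id}.txt"  # hypothetical key scheme
    blob_store[object_key] = body.encode("utf-8")
    record = {"id": doc_id, "title": title, "body_key": object_key}
    size = len(json.dumps(record).encode("utf-8"))
    if size > MAX_RECORD_BYTES:
        raise ValueError("metadata record exceeds the record size limit")
    records[doc_id] = record
    return size

def load_body(doc_id):
    """Follow the pointer stored in the record to fetch the full text."""
    return blob_store[records[doc_id]["body_key"]].decode("utf-8")

big_body = "x" * 200_000  # well past 64K
print(save_document("a1", "Long Paper", big_body))
print(len(load_body("a1")))
```

The record stays tiny no matter how long the document is, so the 64K limit stops being a concern for the body text. At the prices quoted above, $1 against $0.095 per GB per month works out to roughly 10.5 times.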

Re:Bad planning (1)

Anonymous Coward | about 2 years ago | (#42972043)

The AWS platform and the ease of scaling it offers. The application can actually scale itself with their API. I know you can scale *sql horizontally, but you can't argue that it's easier.

From TFA:
"Our client said they didn't need a full-text search on the text or abstract of the documents; they only cared about people searching keywords and categories. That’s fine—we could always add further search capabilities later on, using third-party indexing and searching tools such as Apache Lucene."

Consider a typical website that needs text search. Would you implement text search yourself with your nicely normalized database? Or do you just denormalize the data and store it in a specialized database that has been maintained and developed for years, like Apache Solr, or Lucene as he mentions? My point is that it's quite common to duplicate your data across multiple specialized db backends. This is easier with the NoSQL paradigm because you don't need to normalize your data. Concurrency is the price you pay. For an application centered around scientific articles, concurrency understandably isn't a priority.

Re:Bad planning (1)

Samantha Wright (1324923) | about 2 years ago | (#42975615)

What functionality is DynamoDB providing in this context that Lucene wouldn't? And what the hell is the client going to do with the database before Lucene is put into place?

Re:Bad planning (1)

David Off (101038) | about 2 years ago | (#42972193)

Interesting analysis.

I've been messing around writing my own Java NoSQL CMS called Magneato. It stores articles in XML because I use XForms for the front end (maybe a bad choice but there isn't a good forms solution yet, not even with HTML5) and I use Lucene/Bobo for the navigation and search side of things. It is focussed on facetted navigation although you can have relations between articles: parent of, sibling etc via Lucene.

It actually sounds like my efforts are better than what this team has produced.

Re:Bad planning (0)

Anonymous Coward | about 2 years ago | (#42973429)

Which, along with his horrible Java vs. C# comparison [slashdot.org], makes Jeff Cogswell officially the Slashdot contributor with the worst analytical skills.

Yes, now that Jon Katz has disappeared.

Re:Bad planning (1)

gargleblast (683147) | about 2 years ago | (#42975819)

No no. Jeff Cogswell first man ever whip MongoDB. MongoDB impressed.

This article is garbage (0)

JDG1980 (2438906) | about 2 years ago | (#42971223)

TL;DR: Jeff Cogswell doesn't understand how relational databases work. Or "the cloud", for that matter.

Re:This article is garbage (0, Offtopic)

Anonymous Coward | about 2 years ago | (#42971371)

So Slashdot BI has value. It helps identify authors who don't know what they're talking about.

Ironically, I just came to the opposite conclusion (0)

Anonymous Coward | about 2 years ago | (#42971237)

http://travispbrown.com/post/43167533260/a-tale-of-two-databases-dynamodb-and-mongodb

Bad Choice (0)

Anonymous Coward | about 2 years ago | (#42971327)

Mongo has more punch [youtube.com]

My migration path (5, Funny)

Lieutenant_Dan (583843) | about 2 years ago | (#42971383)

We decided that MongoDB was adequate but didn't leverage the synergies we were trying to harvest from our development methodologies.

We looked at GumboDB and found it was lacking in visualization tools to create a warehouse for our data that would provide a real-time dashboard of the operational metrics we were seeking.

Next up was SuperDuperDB which was great from a client-server-man-in-the-middle perspective but required a complex LDAP authentication matrix that reticulated splines within our identity management roadmap.

After that I quit. I hear they are using Access 95 with VBA.

Re:My migration path (2)

mcmonkey (96054) | about 2 years ago | (#42971843)

After that I quit. I hear they are using Access 95 with VBA.

I think you're trying to be funny (or at least sarcastic) but the last time I worked on a system that stored multiple values in a field as a delimited string--as this guy proposes storing multiple authors and keywords--was for a late 90s dotcom running a web site off of an Access 97 mdb.
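
A small sketch of why the delimited-string approach is fragile, assuming author names written in "Surname, Initial" form. The functions `pack_authors` and `unpack_authors` are hypothetical and are not taken from the article.

```python
DELIMITER = ","


def pack_authors(authors):
    """Store several values in one field as a delimited string."""
    return DELIMITER.join(authors)


def unpack_authors(field):
    """Recover the values by splitting on the delimiter."""
    return field.split(DELIMITER)


authors = ["Smith, J.", "Doe, A."]

# The comma inside each name collides with the delimiter, so the round trip
# yields four fragments instead of the two original names.
round_trip = unpack_authors(pack_authors(authors))

# A list-valued field keeps each value intact and needs no escaping rules.
as_list = list(authors)
```

Any delimiter has the same problem unless the code also escapes it inside values, which is one more piece of hand-written logic to maintain.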

Re:My migration path (0)

Anonymous Coward | about 2 years ago | (#42972197)

I think you're trying to be funny (or at least sarcastic) but the last time I worked on a system that stored multiple values in a field as a delimited string--as this guy proposes storing multiple authors and keywords--was for a late 90s dotcom running a web site off of an Access 97 mdb.

We used to do that but we found the locking issues on an Access 97 database to be completely unacceptable and the install dependencies too limiting in the cloud we've moved our tree house to. Since then we've implemented a new raw-file-based system that doesn't have the nasty performance overhead of locking systems and integrity and is really simple to implement:

System.IO.File.ReadAllLines(SettingsConfig.Instance.DatabaseFile).Contains("search query")

We've got it implemented in about every platform including XSLT but those are only for paying users of our "pro" product.

Ironically, I came to the opposite conclusion (2)

Travis Brown (2847633) | about 2 years ago | (#42971395)

Re:Ironically, I came to the opposite conclusion (1)

Anonymous Coward | about 2 years ago | (#42971405)

Where's the irony exactly?

Re:Ironically, I came to the opposite conclusion (1)

Travis Brown (2847633) | about 2 years ago | (#42971551)

I guess it's ironic that we both recently posted articles that documented our thought process on choosing a NoSQL database, but coming to an opposite conclusion. Apologies, maybe it was just coincidental?

Re:Ironically, I came to the opposite conclusion (1)

Anonymous Coward | about 2 years ago | (#42971813)

Yes, there is nothing ironic about that at all. Even when applying the retarded definition of irony popularized by Alanis Morissette.

Re:Ironically, I came to the opposite conclusion (0)

Anonymous Coward | about 2 years ago | (#42973003)

Travis's expectation is that he made the "right" choice, and one of the products is better. This article postulates otherwise, which is against his expectations. That, my friend, is called irony. A clash between expectations and reality. Would you like some more help understanding this?

Re:Ironically, I came to the opposite conclusion (0)

Anonymous Coward | about 2 years ago | (#42973843)

That's not irony.

Re:Ironically, I came to the opposite conclusion (0)

Anonymous Coward | about 2 years ago | (#42973995)

Merriam-Webster [merriam-webster.com], especially (3a):

incongruity between the actual result of a sequence of events and the normal or expected result

If you truly were not aware of the definition of irony, let me suggest the irony of your attempt to correct others' use of the word.

*You might argue that it was not ironic to you because you had no such expectations. But, you would still be unable to argue that it was not ironic to Travis; his use of the word remains correct.

**captcha: commando

Re:Ironically, I came to the opposite conclusion (0)

Anonymous Coward | about 2 years ago | (#42975397)

Whatever, Travis. Stop defending yourself as AC. It's pretty lame.

Re:Ironically, I came to the opposite conclusion (1)

sexconker (1179573) | about 2 years ago | (#42971597)

Where's the irony exactly?

Unless Travis Brown ejaculated while reaching his conclusion, there is none.
http://www.youtube.com/watch?v=WY_amJ0YZrM [youtube.com]

Re:Ironically, I came to the opposite conclusion (4, Funny)

Travis Brown (2847633) | about 2 years ago | (#42971627)

I'm sorry that I did not use the word "Ironically" correctly. You win internet.

Re:Ironically, I came to the opposite conclusion (1)

Jeng (926980) | about 2 years ago | (#42971739)

Did you submit it as an article here?

If not please do.

Re:Ironically, I came to the opposite conclusion (1)

Travis Brown (2847633) | about 2 years ago | (#42971903)

Thanks! I just submitted it. My first submission to slashdot.

Re:Ironically, I came to the opposite conclusion (1)

Jeng (926980) | about 2 years ago | (#42973181)

I gave it a bump in the firehose, but who knows if it will make it to the main page.

Related: Why waffle makers are better than puppies (0)

Anonymous Coward | about 2 years ago | (#42971729)

We decided we wanted something comforting, so naturally we chose a waffle maker.

Is it just me, or is anyone else tired of seeing authors trying to pass off stuff like that as reasoning?

you FAIL It (-1)

Anonymous Coward | about 2 years ago | (#42971815)

Or a publ1c club, Over thE same

What? (0)

Anonymous Coward | about 2 years ago | (#42972411)

This guy is seriously throwing all his data into one comma delimited field? What's the database for again?

Re:What? (1)

Desler (1608317) | about 2 years ago | (#42972597)

His solution becomes web scale.

Cogswell has the analytical skills of a wet noodle (0)

Anonymous Coward | about 2 years ago | (#42972637)

Why does slashdot keep giving him exposure?

The crux of the entire article... (3, Insightful)

CHK6 (583097) | about 2 years ago | (#42972661)

Their budget was limited.

The same sentiment is echoed multiple times in the article. So this really isn't about how a development team chose DynamoDB over MongoDB. Rather, the financial limitations of the client mandated that the development team use DynamoDB. In fact the article is more in favor of using MongoDB over DynamoDB, but the client's requirements forced their hands into using an alternative that was not as favorable for development.

Re:The crux of the entire article... (0)

Anonymous Coward | about 2 years ago | (#42974765)

MongoDB is free, if you don't care about "vendor" support. There's certainly a big enough community around MongoDB where 99% of your problems can be answered by simply googling it. Fail on the author's part.

Reinventing the wheel (0)

Anonymous Coward | about 2 years ago | (#42974619)

Re: "at times it felt like going back two decades in technology by writing indexes ourselves"

More like double that, to four decades. Custom written index maintenance code? Really!? This is no kind of positive recommendation for DynamoDB, more like an indictment of it.

so what he is saying is (0)

Anonymous Coward | about 2 years ago | (#42977127)

So what he is saying is: the tools aren't mature, you have to re-invent the wheel with them, the wheel they invented is ok, but you will have to invent your own wheel. Somehow, that's all good though, you should try to follow what we did instead of using mature tools that already exist to build web infrastructure. We have religion about some of the products we use, and hope you will pick some of the tools we used for the same irrational reasons we used. It will take more time, cost more and won't get you any further ahead, but you might feel warm and fuzzy inside afterwards. On the other hand, you might not get as far as us, there is no code sharing so you are all on your own, and the time delays and extra costs might just kill your business/idea. This is news for nerds, just not good news for nerds. More like a cautionary tale.

I just sneezed into my punch cards (2)

adnonsense (826530) | about 2 years ago | (#42978811)

FTFA:

We weren't thrilled about this, because writing your own indexes can be problematic. Any time we stored a document, we would have to update the index. That's fine, except if anything goes wrong in between those two steps, the index would be wrong. However, the coding wouldn't be too terribly difficult, and so we decided this wouldn't be a showstopper. But just to be sure, we would need to follow best practices, and include code that periodically rebuilds the indexes.

Hello, I'm a time traveller from 1973 where I've been fondly imagining you folks in the future had written software to solve this kind of problem in a more generic fashion. Back in the past we have some visionary guy by the name of Codd, and in my wilder dreams I sometimes imagine by the year 2000 someone has created some kind of revolutionary database software which is based on his "SEQUEL" ideas and does fancy stuff like maintaining its own indexes.

Then I wake up and realise it was just a flight of fantasy.
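
The two-step write quoted from the article can be sketched roughly as follows, assuming in-memory dictionaries in place of a real data store. Every name here (`documents`, `keyword_index`, `store_document`, `rebuild_index`, `search`) is hypothetical, and the `fail_before_index` flag exists only to simulate a crash between the two steps.

```python
from collections import defaultdict

documents = {}                    # doc id -> {"title": ..., "keywords": [...]}
keyword_index = defaultdict(set)  # keyword -> set of doc ids, maintained by hand


def store_document(doc_id, title, keywords, fail_before_index=False):
    """Write the document, then update the hand-maintained index."""
    # Step 1: write the document itself.
    documents[doc_id] = {"title": title, "keywords": list(keywords)}
    if fail_before_index:
        # Simulated failure: the document exists but the index never sees it.
        raise RuntimeError("simulated crash between the two writes")
    # Step 2: update the index.
    for keyword in keywords:
        keyword_index[keyword].add(doc_id)


def rebuild_index():
    """Periodic repair: recompute the whole index from the stored documents."""
    keyword_index.clear()
    for doc_id, doc in documents.items():
        for keyword in doc["keywords"]:
            keyword_index[keyword].add(doc_id)


def search(keyword):
    """Look up document ids by keyword using only the index."""
    return sorted(keyword_index.get(keyword, set()))
```

After a simulated failure, `search` misses the affected document until `rebuild_index` runs, which is the inconsistency window the article describes and which a database with built-in index maintenance closes for you.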
