Discover Top Posts Tagged with #ampcloud

I propose a revised version of the famous proverb: "The way to a man's heart is through his startup."

Speeding up Mongoose queries by requesting only the fields you need

I'm currently building a startup ([ampcloud](http://ampcloud.fm)) with [Node.js](http://nodejs.org), [MongoDB](http://mongodb.org), [Mongoose](http://mongoosejs.com/), and a handful of other tools. After spending quite a few years in the Django world, it's been fun doing a mental context switch into the land of JavaScript, callbacks, and closures. Occasionally I've run into some gotchas, and this particular one is a great example.

Let's say you're building a blog, and part of your database schema looks something like this:

```javascript
var mongoose = require('mongoose');
var Schema = mongoose.Schema;

// Comments are embedded in their parent post (see PostSchema.comments).
var CommentSchema = new Schema({
  title: {type: String},
  body: {type: String},
  createdAt: {type: Date}
});

var PostSchema = new Schema({
  author: {type: String},
  title: {type: String},
  createdAt: {type: Date},
  slug: {type: String},
  comments: [CommentSchema]
});

module.exports.Post = mongoose.model('Post', PostSchema);
```

Every post is stored as a separate document in MongoDB, but all comments are embedded within it. This means that when you fetch a post, you'll get all the comments back with it.

Now let's say you want to display a list of the 20 most recent blog posts on your home page. Assuming you're using [Express](http://expressjs.com/), you would write a view like:

```javascript
app.get('/', function(req, res) {
  Post
    .find()
    .desc('createdAt') // newest first
    .limit(20)
    .run(function(err, posts) {
      if (err) {
        res.render('error', {status: 500});
      } else {
        res.render('allposts', {posts: posts});
      }
    });
});
```

You'd also want to add an index to allow efficient querying by date created:

```javascript
PostSchema.index({createdAt: 1});
```

Your blog will probably work well at first, but you'll run into problems as soon as one of your amazing posts goes viral and gets thousands of comments. You'll notice that your main page starts taking a lot longer to load. Even when you're the only one browsing your blog, it just won't feel as snappy anymore.

#### Beware: Mongoose fetches all fields by default

The culprit is the **comments** field. Because a Mongoose query requests all fields of a document by default, every visit to the main page makes it request and parse the entire list of comments for each of the 20 posts. Every time. You don't even need the list of comments to render the main page.

Let's get rid of the **comments** field by adding the following line to the query chain:

```javascript
.exclude('comments')
```

The final result:

```javascript
app.get('/', function(req, res) {
  Post
    .find()
    .desc('createdAt') // newest first
    .limit(20)
    .exclude('comments') // don't fetch the embedded comments
    .run(function(err, posts) {
      if (err) {
        res.render('error', {status: 500});
      } else {
        res.render('allposts', {posts: posts});
      }
    });
});
```

You'll find that this performs a lot better. The problem isn't so much that MongoDB can't return the data quickly enough. Rather, Node.js has to spend much of its time parsing the extra BSON (MongoDB's binary, JSON-like format) into JavaScript objects, which is both unnecessary and time-consuming.
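
The chain above uses the query helpers from the Mongoose version I was running at the time. As far as I know, newer Mongoose releases spell the same query with `sort()`, `select()`, and `exec()`, so here is a minimal sketch of that form. The function name `findRecentPosts` is my own invention, and it assumes `Post` is the model compiled from `PostSchema`; passing the model in as a parameter is just a way to keep the snippet free of a database connection.

```javascript
// Sketch only: `Post` is assumed to be a Mongoose model built from PostSchema.
// `findRecentPosts` is a hypothetical helper name, not part of Mongoose.
function findRecentPosts(Post, limit) {
  return Post
    .find()
    .sort({ createdAt: -1 })  // newest first
    .limit(limit)
    .select('-comments')      // a leading "-" excludes the field
    .exec();                  // run the query and return its result
}
```

In the Express handler you would call `findRecentPosts(Post, 20)` and render `allposts` with whatever it resolves to. Treat this as a starting point and check it against the documentation for the Mongoose version you actually run.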

Not surprisingly, I recently encountered this issue in production. I made the fix right at 3:00 GMT, and the load dropped dramatically.

#### Takeaway: think about your queries

When your models start accumulating lots of data, think about whether you can request a subset of fields when making queries. See the [Mongoose query documentation](http://mongoosejs.com/docs/query.html) for details.

**Caveat**: Keep in mind that you won't gain much by excluding fields that store primitive types like Strings, Numbers, or Dates. Even worse, your code will probably get harder to read and maintain. Only make such optimizations when you have to.

#### Some final notes

The above schema suffers from a fundamental flaw: it doesn't scale well. If a blog post gets thousands of comments, you'll probably want to paginate the comments and only show several hundred at a time. But with this schema, that's awkward. Apart from slicing the embedded array by position with the `$slice` projection operator, a plain query gives you the comments all or nothing.

To make this production ready, you'd probably want to split Comment and Post into separate Mongoose models, instead of nesting Comments within Posts as embedded documents. Each Comment would be its own MongoDB document that stores the id of its Post, so you could efficiently query for arbitrary subsets of the comments on a particular blog post.
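
To make the separate-model idea concrete, here is a rough sketch of fetching one page of comments. It assumes a hypothetical `Comment` model whose documents carry a `post` field holding the parent Post's id, plus the `createdAt` field from the schema above; the function name `findCommentsPage` and the zero-based `page` argument are also my own choices. You'd want an index covering `post` and `createdAt` for this to stay fast.

```javascript
// Sketch only: `Comment` is assumed to be a standalone Mongoose model with
// a `post` field (the parent Post's id) and a `createdAt` date.
// `findCommentsPage` is a hypothetical helper; `page` is zero-based.
function findCommentsPage(Comment, postId, page, pageSize) {
  return Comment
    .find({ post: postId })   // only comments belonging to this post
    .sort({ createdAt: 1 })   // oldest first, like a typical thread
    .skip(page * pageSize)    // jump past the earlier pages
    .limit(pageSize)          // fetch a single page
    .exec();
}
```

For example, `findCommentsPage(Comment, post._id, 2, 100)` would skip the first 200 comments and return at most the next 100. Skipping gets slower as the offset grows, so for very long threads you may prefer to page by `createdAt` instead.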

#mongodb #mongoose #nodejs #tech #ampcloud

Battle of the Digital Jukebox

Turntable.fm and its slew of competitors showed us how voracious music lovers can be when they come together to play tracks. The next craze: digital jukeboxing. We're seeing a handful of startups pop up, all with a similar goal in mind: let patrons at bars choose what they're listening to on the house speakers via their own mobile phones and music libraries. The most common features so far are voting on tracks in the queue, point and reward systems, and new approaches to music discovery. This is who Cantora has talked to so far:

AmpCloud