ruk·si

🛤️ Ruby on Rails
Queries

Updated at 2015-09-23 01:55

Related to Ruby Guide and Ruby on Rails Guide.

Use explain to debug queries.

User.where(id: 1).includes(:articles).explain

You can debug database through Rails console.

# bundle exec rails console
# or to Heroku: heroku run console -a <HEROKU_APP>
ActiveRecord::Base.connection.tables
ActiveRecord::Base.connection.execute("SELECT * FROM libraries").to_a
Model.find(name: "Hello").pluck(:id)
Model.find(1).other_related_model.map{ |x| x.name }
User.where(:email => "me@ruk.si").first.update_attribute(:name, "Ruksi")

Use existence checks if you don't need the data.

Client.exists?(1) # does client with id 1 exist?

Don't use string interpolation in queries. It will make your code susceptible to SQL injection attacks.

# bad - param will be interpolated unescaped
Client.where("orders_count = #{params[:orders]}")

# good - param will be properly escaped
Client.where('orders_count = ?', params[:orders])

Use named placeholders if a query has more than one variable.

# bad
Client.where(
  "created_at >= ? AND created_at <= ?",
  params[:start_date], params[:end_date]
)

# good
Client.where(
  "created_at >= :start_date AND created_at <= :end_date",
  start_date: params[:start_date],
  end_date: params[:end_date]
)

Don't use where if you are going to fetch only one record by id. Use find.

# just silly
User.where(id: id).take

# good, raises ActiveRecord::RecordNotFound if not found
User.find(id)
User.find([1, 10])

Prefer find_by to where when searching for a single record.

# bad
User.where(first_name: 'Bruce', last_name: 'Wayne').first

# good
User.find_by(first_name: 'Bruce', last_name: 'Wayne')

Use find_each to iterate over a collection of ActiveRecord objects. Looping through a collection of records from the database (using the all method, for example) is very inefficient since it will try to instantiate all the objects at once. In that case, batch processing methods allow you to work with the records in batches, thereby greatly reducing memory consumption.

# bad
Person.all.each do |person|
  person.do_awesome_stuff
end

Person.where("age > 21").each do |person|
  person.party_all_night!
end

# good
Person.find_each do |person|
  person.do_awesome_stuff
end

Person.where("age > 21").find_each do |person|
  person.party_all_night!
end

Prefer the use of where.not to SQL.

# bad
User.where("id != ?", id)

# good
User.where.not(id: id)

Use ordering.

User.order(:created_at)
User.order(created_at: :desc)
User.order(orders_count: :asc, created_at: :desc)

Use limits.

User.limit(5)

Use grouping.

Order.select(
  "date(created_at) as ordered_date, sum(price) as total_price"
).group("date(created_at)")

Order.group(:status).count

Use null relation when you know you won't get any results. Maintains the default interface like chaining but returns zero responses.

Article.none

Use read-only responses if you know none of the results will be modified. Trying to modify read-only result raises ActiveRecord::ReadOnlyRecord exception.

Client.readonly.first

Avoid raw SQL queries. Usually you are doing something wrong if you find yourself doing this as active record queries have very extensive customization options. More about raw queries in raw queries snippet

master_keys = Key
 .select(:id)
 .joins("LEFT JOIN lock ON (lock.key_id = key.id AND lock.target IS NOT NULL)")
 .order("COUNT(lock.id) ASC, RANDOM()")
 .group("key.id")
 .limit(5)

Batch insert is a good use-case for raw SQL. Using normal saves for 100 or more inserts has really bad performance because each insert is done their own transaction and as separate queries.

created = Time.current
inserts = Key.find_each do |key|
    "(#{lock.id}, '#{created}', '#{created}')"
end
sql = ""
sql << "INSERT INTO lock (lock_id, created_at, updated_at) "
sql << "VALUES #{inserts.join(", ")}"
ActiveRecord::Base.connection.execute sql

Define pluck to hash if working with a large set of records. Pluck returns values as an array but you can define your own variant that keeps they column names as keys. select does the same but unfortunately has a big performance penalty for fetches over than 50 rows as it still instantiates active record for each result.

# in user.rb
def self.pluck_to_hash(keys)
  pluck(*keys).map{ |pa| Hash[*keys.zip(pa).flatten] }
end

# in controller
User.limit(:10).pluck_to_hash(['id, name, email, created_at'])

Other important methods.

User.take  # returns 1 user without any filtering or order, nil if not found
User.first # returns first user ordered by primary key
User.last  # returns last user ordered by primary key
User.pluck(:first_name) # returns only an array of first names

Use heredocs with squish on complex SQLs. squish removes the indentation and newline characters so log messages become readable.

User.find_by_sql(<<SQL.squish
  SELECT
    users.id,
    accounts.plan
  FROM users
    INNER JOIN accounts
      ON accounts.user_id = users.id
SQL)

Joins

# if your ActiveRecords are configured to have relation...
Category.joins(:articles) # categories and related articles
Article.joins(:category, :comments) # articles that have category and comment
Article.joins(comments: :guest) # articles that have a comment made by a guest

Use eager loading when processing a lot of relations. For example, if you want to display a list of posts and for each of these posts the last three comments.

users = User.limit(10)
users.each do |u|
  puts u.address.postcode
end
# => does 11 queries

users = User.includes(:address).limit(10)
users.each do |u|
  puts u.address.postcode
end
# => does 2 queries

Scopes

Use named scopes. Scoping allows you to specify commonly-used queries which can be referenced as method calls.

class User < ActiveRecord::Base
  default_scope { where("removed_at IS NULL") }
  scope :active, -> { where(active: true) }
  scope :inactive, -> { where(active: false) }
  scope :with_orders, -> { joins(:orders).select("distinct(users.id)") }
  scope :created_before, -> (time) { where("created_at < ?", time)
end

User.all                                    # => [not removed users]
User.active                                 # => [active users]
User.created_before(Time.zone.now)          # => [users created before now]
User.active.created_before(Time.zone.now)   # they can be chained
User.unscoped {
  User.all                                  # => [all users, even removed]
}

Use unscope and only. You should keep your default scopes as usable as possible, but sometimes you need a slightly different scope only once in an app. These methods are for that purpose.

Article.where('id > 10').limit(20).order('id asc').unscope(:order)
# removes ordering
Article.where('id > 10').limit(20).order('id desc').only(:order, :where)
# only filters and orders the result

Avoid named scopes with complex lambdas. When a named scope defined with a lambda and parameters becomes complicated, it is preferable to make a class method instead which serves the same purpose of the named scope and returns an ActiveRecord::Relation object.

class User < ActiveRecord::Base
  def self.with_orders
    joins(:orders).select("distinct(users.id)")
  end
end

Note that class method scopes are not always chainable.

# unchainable
class User < ActiveRecord::Base
  def User.old
    where("age > ?", 80)
  end

  def User.heavy
    where("weight > ?", 200)
  end
end

# chainable
class User < ActiveRecord::Base
  scope :old, -> { where("age > 60") }
  scope :heavy, -> { where("weight > 200") }
end

Sources