define_gender.Rd
Given a database table tbl, assigns the likely gender of each person based on
their first name. The first name must be present as a column in tbl, and that
column is passed as the argument firstname_left.
define_gender(tbl, conn, firstname_left, drop_missing)
tbl: A lazily evaluated dbplyr query from conn.
conn: An object of class SQLiteConnection to a SQLite database.
firstname_left: Column in tbl containing the first name, used for
joining gender on.
drop_missing: If TRUE, drops records without a clear gender assignment. An assignment is clear when the probability of either gender is 0.8 or higher.
tbl augmented by a gender column.
The function uses the internal table FirstNamesGender, which
assigns the likely gender to each first name. The table is generated from
genderize.io.
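The lookup and the drop_missing rule can be illustrated with a minimal base R sketch on in-memory data frames. It does not reproduce the package's implementation, which works on a lazy database query. The table first_names_gender and its columns firstname, gender and probability are hypothetical stand-ins, since the layout of FirstNamesGender is not shown here. The sketch also assumes that probability refers to the assigned gender.

```r
# Hypothetical stand-in for the internal lookup table; the real column names
# and values of FirstNamesGender are assumptions made for this sketch.
first_names_gender <- data.frame(
  firstname = c("anna", "peter", "kim"),
  gender = c("female", "male", "female"),
  probability = c(0.98, 0.99, 0.55),
  stringsAsFactors = FALSE
)

# Example input with one ambiguous name ("kim") and one unknown name ("xyzzy").
people <- data.frame(
  id = 1:4,
  firstname_old = c("anna", "peter", "kim", "xyzzy"),
  stringsAsFactors = FALSE
)

# Left join so that every input record is kept, matched or not.
joined <- merge(
  people, first_names_gender,
  by.x = "firstname_old", by.y = "firstname",
  all.x = TRUE
)

# A clear assignment: a match exists and its probability is at least 0.8.
clear <- !is.na(joined$probability) & joined$probability >= 0.8
joined$gender[!clear] <- NA

# Analogue of drop_missing = TRUE: keep only clearly assigned records.
result <- joined[clear, c("id", "firstname_old", "gender")]
result <- result[order(result$id), ]
```

In this sketch, "kim" is dropped because its probability is below 0.8, and "xyzzy" is dropped because it has no match in the lookup table.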
firstname_left should be free of middle names and middle
initials; otherwise the gender assignment fails, even though using only
the first name would result in a high-confidence assignment.
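One way to prepare such a column is to keep only the first whitespace-separated token of each name before calling define_gender. The helper clean_firstname below is hypothetical and not part of the package. It works on an in-memory character vector, so for a lazy database table the same step would have to be done before loading the data or expressed in SQL.

```r
# Hypothetical helper, not part of the package: keep only the first token
# of each name so that middle names and middle initials are removed.
clean_firstname <- function(x) {
  x <- trimws(x)
  # Drop everything from the first whitespace onward,
  # e.g. "Anna Maria" becomes "Anna" and "Peter J." becomes "Peter".
  sub("\\s.*$", "", x)
}

cleaned <- clean_firstname(c("Anna Maria", "Peter J.", " Kim "))
```

This rule also truncates first names that are written as two words, such as "Jean Pierre", so it is a simplification that may not suit every data set. Hyphenated names are left unchanged.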
if (FALSE) {
  # Not run: requires an open SQLite connection `conn` and a lazy table `old_table`.
  new_table <- define_gender(
    conn = conn, tbl = old_table,
    firstname_left = "firstname_old", drop_missing = TRUE
  )
}