Assigned to update Lewa's lion population records and produce a multi-generational family tree. Updated the Excel dataset tracking births, deaths, and lineage — then used RStudio to generate a visualised family tree from the cleaned data.
Why this data matters
Individual lion identification and genealogical tracking are core parts of wildlife management at Lewa. Knowing which females are breeding, which lineages are surviving, and how pride structure changes over time allows the research team to assess population health, detect early signs of genetic stress, and understand how predators interact with prey populations across the landscape.
The dataset I was assigned tracks Lewa's female lions and the cubs they have produced, recording births, deaths, paternity, and cub survival over multiple generations. The records needed updating with the most recent data, and the team wanted a visualised family tree to make the generational structure legible at a glance.
The Work
The existing spreadsheet tracked lionesses and their cubs across multiple generations, but it needed updating with the most recent births, deaths, cub names, paternity records, and survival statuses. I went through the dataset systematically, cross-referencing field records to fill in missing entries, correct inconsistencies, and add new data for the most recent breeding seasons (2023–2024).
Using the updated Excel dataset as the source, I built a family tree visualisation in RStudio. This involved structuring the parent-offspring relationships as a graph, mapping generational levels, and rendering the lineage in a readable format that the research team could use directly in their reporting. The RStudio output is shown below.
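As an illustration of the graph-structuring step, here is a minimal sketch in base R. It is not the script used during the placement: the column names (`id`, `mother`, `status`), the lion IDs, and the helper `generation_of` are all hypothetical, and the real dataset holds more fields. The sketch turns a lineage table into a mother-to-offspring edge list and assigns each lion a generation level.

```r
# Hypothetical lineage table: one row per lion, with its mother's ID
# (NA for founder females whose mother is not in the records).
lions <- data.frame(
  id     = c("F1", "F2", "C1", "C2", "C3"),
  mother = c(NA,   "F1", "F1", "F2", "F2"),
  status = c("alive", "alive", "dead", "alive", "unknown"),
  stringsAsFactors = FALSE
)

# Edge list for the graph: one mother -> offspring edge per non-founder row.
edges <- lions[!is.na(lions$mother), c("mother", "id")]
names(edges) <- c("from", "to")

# Generation level: founders are 1, every other lion is its mother's level + 1.
generation_of <- function(ids, mothers) {
  gen <- setNames(rep(NA_integer_, length(ids)), ids)
  gen[is.na(mothers)] <- 1L
  # Each pass resolves the lions whose mother already has a level.
  while (anyNA(gen)) {
    ready <- is.na(gen) & !is.na(gen[mothers])
    if (!any(ready)) stop("unresolved mothers: check for missing IDs or cycles")
    gen[ready] <- gen[mothers[ready]] + 1L
  }
  unname(gen)
}

lions$generation <- generation_of(lions$id, lions$mother)
```

An edge list in this `from`/`to` shape, together with the per-lion `generation` and `status` columns, is the kind of input a graph-plotting package would need to lay out the tree by generation and colour nodes by survival status.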
Family Tree
Each node represents an individual lion. Hover for details. Colour indicates survival status. Lines show mother-offspring relationships. Key males — Muffasa, Ntulele, Jacob, Dick, Cat-tail — appear as paternity references.
Summary Statistics
RStudio Output
This is the family tree generated directly from RStudio using the updated dataset — the actual deliverable produced during the placement for the research team.