
Alumni Profiles Series: Yingbo Li
Yingbo Li received her B.S. in statistics from Peking University and her Ph.D. in statistics from Duke University, where she also completed the Certificate in College Teaching. She is currently a director-level research data scientist at Capital One. Early in her career, Dr. Li worked at Clemson University as an assistant professor in statistics, followed by a position at Southern Methodist University as a visiting assistant professor for 10 months before joining Capital One. In her current role, Dr. Li applies statistics and machine learning (including Bayesian statistics!) to build a variety of models that drive business value, while providing teaching and consulting services to the data science communities within the company.
What has your career path looked like since you graduated?
I received my Ph.D. in 2013. Then I got a faculty job, a tenure-track assistant professor position at Clemson University, which is one of the top two public schools in South Carolina, not too far from Duke. It was a math department with about 60 faculty members. There were very good researchers there and it was a very supportive environment for junior faculty. I enjoyed doing research at Clemson.
Then we made a family decision to move to Dallas, Texas. First, I accepted a visiting position at Southern Methodist University (SMU) for one year. During that time, I applied for faculty positions at both SMU and the University of Texas at Dallas, but neither gave me an offer. Luckily, I was able to get a job at Capital One and I’m happily working there.
Did you have a clear idea of the kind of industry job you wanted when you transitioned from academia?
This is a constrained optimization problem. I had a lot of considerations about location: I could not try Facebook or Google because they didn’t allow remote working, and I didn’t want to move away from my family. I could only look at the local market. Actually, Dallas is a pretty good place, a great city with a very diverse profile of companies, headquarters, and jobs. For example, there are PepsiCo and Lay’s, American and Southwest Airlines, and regional headquarters of almost all major banks!
A lot of the time, it depends on luck: what’s available at the time, what type of openings they have, what type of people they are looking for, and whether there’s a fit for your profile. There is a lot of randomness. Basically, my strategy was to make good use of alumni networks. At that time, Thomas Leininger, who graduated in 2014, was working at Capital One. We had casually met before I started to look for jobs. When I moved to Dallas, I knew he was there, so we reconnected. Thomas helped me a lot.
Can you choose not to manage people in your role?
Capital One is different from a lot of other companies. It has the reputation of a tech company in the banking industry. Within the data science job family, individual contributors now have a path to advance to higher levels than before without taking on people management responsibilities. There are manager-level or even director-level individual contributors who bring a great amount of value to the company through their technical expertise. For myself, my background and experience in academic teaching and research enable me to create and provide a lot of statistics and machine learning training internally, along with conducting applied R&D work and turning that into real business value.

How do you use your Ph.D. training in your day-to-day work? In particular, is Bayesian statistics useful?
It helps a lot. For example, I use hierarchical time series models in my job. Having a solid knowledge of Bayesian statistics and knowing how to use it correctly is critical.
Whether Bayesian statistics is useful depends on the application: a given method suits some problems and not others. In my academic research at Clemson, we developed a Bayesian change point detection method to detect change points in temperature series. Later, when I joined Capital One, my first project was to extend that method into a more generic change point detection method that I can apply to a lot of time series in the company.
Luckily, a lot of my projects are kind of Bayesian-related (at least I sense that’s suitable). And if you can propose a practical solution that is comparable to a non-Bayesian solution and you can show it has an advantage, why not? I don't think people care about whether your solution is special. They only care about whether it's good, stable, and easy to use.
Meanwhile, how we approach the problem does not necessarily make a difference for deployment. A Bayesian version of regression has exactly the same form as a regular regression in the prediction stage. Nothing becomes more complicated, so people are happy as long as the model’s performance translates to dollar value.
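As an editorial illustration of that last point (not code from Dr. Li or Capital One), here is a minimal sketch of why the prediction stage looks the same. It assumes a linear regression with a zero-mean Gaussian prior on the coefficients and a known noise variance; the function names, the simulated data, and the values of `noise_var` and `prior_var` are all hypothetical. Under those assumptions, the Bayesian point prediction is just the new inputs multiplied by the posterior mean of the coefficients, the same linear form used with ordinary least squares.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def fit_bayes_posterior_mean(X, y, noise_var=1.0, prior_var=10.0):
    """Posterior mean of the coefficients, assuming a N(0, prior_var * I)
    prior and Gaussian noise with known variance noise_var."""
    p = X.shape[1]
    precision = X.T @ X / noise_var + np.eye(p) / prior_var
    return np.linalg.solve(precision, X.T @ y / noise_var)


def predict(X_new, coef):
    """Point prediction: one linear form, whichever way coef was fitted."""
    return X_new @ coef


# Simulated data (purely illustrative): intercept plus two features.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.5, size=200)

coef_ols = fit_ols(X, y)
coef_bayes = fit_bayes_posterior_mean(X, y, noise_var=0.25, prior_var=10.0)

# The deployed scoring step is identical for both sets of coefficients.
X_new = np.array([[1.0, 0.3, -1.2]])
pred_ols = predict(X_new, coef_ols)
pred_bayes = predict(X_new, coef_bayes)
```

The only thing that differs between the two fits is how the coefficients were estimated; `predict` is shared. A full Bayesian treatment would also carry predictive uncertainty, which this sketch leaves out.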
Do you have any suggestions for students who are interested in a career in data science?
Get internships. Get maybe a few different internships in different types of companies or different areas so you can see a broader picture. I know some faculty don't like their students to get an internship because they perceive it as taking away time from research, so you do need to discuss this with your advisor in advance. If the advisor is okay with it, then knowing the industry through first-hand experience is a good thing. There are companies that take internships very seriously. Capital One is one of them: it wants to make sure that interns have a good experience and that the company gets to know the interns well, because return offers are given according to their performance. We put a lot of effort into our intern programs. A good internship experience, though it may be challenging, should make you feel supported and help you understand the business and the area better.
What is your favorite memory of Duke?
The department Halloween party was very fun. All the professors were in a very light mood and a lot of people even made their own costumes.
Also, I really miss that every year Professor David Banks invited all Ph.D. students to his house to have Thanksgiving dinner with his family. A lot of international students were there. That was always a great time.
Author

Carol Wang, Ph.D.
Recent Ph.D. graduate, Statistical Science
Carol Wang is a recent Ph.D. graduate from the Department of Statistical Science at Duke University, currently working as a data scientist at Meta. Her research specializes in tree-based generative models, encompassing both parametric and non-parametric models, with applications to microbiome compositional data.