Symposium

Come spend your Saturday at the largest DSI event of the year - our annual Symposium. 

Begin the day with coffee, remarks from DSI leadership and the UFII Director, and a keynote.

The symposium continues with speakers from a wide range of research fields at UF in three breakout sessions of four speakers each. Learn about computer vision, bioinformatics, political forecasting, business analytics, and more.

Our symposium will also include two rounds of workshops with several choices in each round, so you can brush up on your Python, learn about data visualization, or deepen your knowledge of machine learning.

This is a fantastic opportunity to network with students and faculty who are passionate about the impact of data science and the tools they utilize to realize that impact. 

Coffee and lunch will be served.

If you plan to attend, please RSVP through this form: 

https://goo.gl/forms/5jL1Ur3PtGYO2PBc2

Sign up by February 24th: the first 100 people to RSVP get a free DSI shirt!

While we urge you to RSVP for food estimates, we will not turn anyone away, so feel free to bring a friend! 


The schedule is below: 

10:30 - 11:00  Registration & Coffee in Grand Ballroom
11:00 - 11:15  Bobbie Isaly - What is DSI?
11:15 - 11:30  Dr. George Michailidis - What is the UFII?
11:30 - 12:30  Keynote Speaker - Dr. Manuel Bermúdez
12:30 - 1:20  Networking Lunch
1:30 - 2:00  Breakout Session 1 (20-minute presentations, 10-minute Q&A)
2:10 - 2:40  Breakout Session 2 (20-minute presentations, 10-minute Q&A)
2:50 - 3:20  Breakout Session 3 (20-minute presentations, 10-minute Q&A)
3:30 - 4:10  Workshop Session 1 (4 workshops)
4:20 - 5:00  Workshop Session 2 (4 workshops)
5:05 - 5:10  Closing Remarks and How to Get Involved

Breakout session 1:

Room 1

Smart/Green Manufacturing: Data Enabled Decision Making and Optimization Applications

Panos M. Pardalos, ISE Department

Center for Applied Optimization, University of Florida

http://www.ise.ufl.edu/pardalos

 

Smart manufacturing (Industry 4.0) is the fourth industrial revolution. With advances in information and telecommunication technologies and data-enabled decision making, smart manufacturing can be an essential component of sustainable development.

 

We are going to discuss some successes and focus on data-enabled decision making and optimization applications. In addition, we will discuss future research directions and new challenges to society.

 

 

Room 2

Behavioral Finance, Data Science, and Sports: Umpires and MLB Totals Market Efficiency

Dr. Brian M Mills, Department of Tourism, Recreation and Sport Management

 

Sports betting markets have been used extensively in understanding market efficiency and behavioral biases, and have played a public role in generating interest in data science and analytics. We use this setting to test the propensity for the MLB totals market to integrate information about umpire home plate assignments, which are only known to the public for certain games. We first use generalized additive models to estimate the strike zone surface in MLB using data on individual pitch location for 2.5 million called pitches from 2008 through 2014. From these models, we aggregate error terms at the individual umpire level as a measure of favorability toward offense or defense, and insert this measure into least squares regressions to identify effects of umpire behavior on actual run totals. We then identify whether totals lines adjust upon release of information about umpire assignments to the public for certain games. Our regressions show that while the market adjusts slightly to umpire assignments, it does not adjust fully, and there are opportunities for sharp bettors to take advantage of this information. We exhibit a simple betting strategy using this granular umpire decision data that returns nearly 10% per bet.
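
As a rough illustration of the second and third steps described in this abstract (aggregating model errors by umpire, then regressing run totals on that measure), here is a minimal Python sketch. It is not the speaker's code. The function names, the toy data, and the assumption that each called pitch already carries a predicted strike probability from a fitted strike zone model are all hypothetical.

```python
from collections import defaultdict


def umpire_favorability(calls):
    """Average residual (actual call minus predicted strike probability) per umpire.

    `calls` is an iterable of (umpire, called_strike, predicted_prob) tuples.
    The predicted probabilities are assumed to come from a strike zone model
    fitted elsewhere (the abstract uses generalized additive models).
    A positive mean means the umpire calls more strikes than the model expects.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for umpire, called_strike, predicted_prob in calls:
        sums[umpire] += called_strike - predicted_prob
        counts[umpire] += 1
    return {umpire: sums[umpire] / counts[umpire] for umpire in sums}


def ols_slope_intercept(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x with one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy data, invented purely for illustration.
toy_calls = [
    ("A", 1, 0.5), ("A", 1, 0.7),   # umpire A calls strikes the model doubts
    ("B", 0, 0.5), ("B", 0, 0.3),   # umpire B calls balls the model doubts
]
favorability = umpire_favorability(toy_calls)

# Hypothetical game-level data: umpire favorability vs. total runs scored.
slope, intercept = ols_slope_intercept([-1.0, 0.0, 1.0], [10.0, 8.0, 6.0])
```

In this toy setup a negative slope would mean that pitcher-friendly umpires are associated with lower run totals, which is the kind of effect the regressions in the talk are meant to identify.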

 

Room 3
Brief Overview of Statistical Designs

Matthew Robinson, Department of Biostatistics


A brief overview of common statistical designs. We talk about statistical ways to compare groups and measure associations between variables while avoiding common pitfalls such as confounding. Related topics, including sample size considerations, power analysis, data formatting, and methods of randomization (such as restricted randomization), will also be covered.
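
To give a flavor of the sample size and power analysis mentioned above, here is a small sketch using the standard normal-approximation formula for comparing two group means. It is an illustration only, not material from the talk. The function name and default values are assumptions, and the effect size is expressed in standard deviation units (Cohen's d).

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample comparison of means.

    Uses the normal approximation n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2,
    where d is the standardized effect size. A t-based calculation gives
    slightly larger numbers for small samples.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


medium = n_per_group(0.5)   # medium effect
small = n_per_group(0.2)    # small effect needs far more participants
```

Halving the detectable effect size roughly quadruples the required sample, which is why the effect size assumption deserves scrutiny before a study begins.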

 

Room 4

Concept Drift Detection: the State-of-the-Art
Shujian Yu, Computational NeuroEngineering Laboratory


In a streaming environment, there is often a need for statistical prediction models to detect and adapt to concept drifts (i.e., changes in the joint distribution between predictor and response variables) so as to mitigate deteriorating predictive performance over time. Various concept drift detection approaches have been proposed in the past decades. However, they do not perform well across different data stream distributions and rely heavily on the availability of true labels. This talk presents a novel framework that can detect and also adapt to the various concept drift types, even in the scenario of expensive labels. The framework leverages a hierarchical set of hypothesis tests in an online fashion to detect concept drifts and employs an adaptive training strategy to significantly boost its adaptation capability. A Request-and-Reverify strategy is further incorporated to significantly reduce the requirement of true labels. The performance of the proposed framework is compared to benchmark approaches using both simulated and real-world datasets spanning the breadth of concept drift types. The proposed approach significantly outperforms benchmark solutions in terms of precision, delay of detection, the adaptability across different concepts as well as the number of required true labels.
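
For readers new to the topic, the general idea of detecting drift with a hypothesis test can be sketched as follows. This is a deliberately simple, generic illustration and not the hierarchical framework or the Request-and-Reverify strategy presented in the talk. It compares a model's error rate in a fixed reference window against a sliding recent window using a two-proportion z-test, and it assumes true labels are available for every prediction. All names and thresholds are hypothetical.

```python
import math


def two_proportion_z(err_ref, n_ref, err_new, n_new):
    """z statistic for the increase in error rate from the reference to the new window."""
    p_ref = err_ref / n_ref
    p_new = err_new / n_new
    pooled = (err_ref + err_new) / (n_ref + n_new)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_ref + 1 / n_new))
    if se == 0:
        return 0.0  # both windows are all-correct or all-wrong: no evidence of change
    return (p_new - p_ref) / se


def detect_drift(errors, window=50, z_threshold=3.0):
    """Return the first stream index at which drift is flagged, or None.

    `errors` is a sequence of 0/1 flags (1 = the model's prediction was wrong).
    The first `window` items form the reference window; each later window of
    the same size is tested against it.
    """
    if len(errors) < 2 * window:
        return None
    err_ref = sum(errors[:window])
    for end in range(2 * window, len(errors) + 1):
        err_new = sum(errors[end - window:end])
        if two_proportion_z(err_ref, window, err_new, window) > z_threshold:
            return end - 1
    return None


# Synthetic streams: a steady 10% error rate, then a jump to 50% at index 100.
stable = [1 if i % 10 == 0 else 0 for i in range(200)]
drifted = stable[:100] + [1 if i % 2 == 0 else 0 for i in range(100)]

no_alarm = detect_drift(stable)
alarm_index = detect_drift(drifted)
```

The gap between the true change point (index 100 here) and the index where the alarm fires is the detection delay, one of the criteria the abstract uses to compare approaches.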
 

 

Breakout session 2:

Room 1

Visualizing Student Success Using a Sankey Diagram in Tableau
Tim Young, CLAS

Assistant Director for Data Management and Analysis


Student success is often considered only through a metric like retention or the four- and six-year graduation rates of a cohort. Simple statistics like these often mask subtle changes that happen at different points along a student’s academic career. I will demonstrate the ability to explore cohorts with a Sankey diagram (a.k.a. ribbon diagram) using Tableau software. I will also demonstrate how this visualization tool can be used for other purposes, like exploring student success in course sequences.
