[UCR_DataScience] REMINDER: Data Science talk by Prof. Samet Oymak, tomorrow Friday May 6th, 12-1pm, MRB Seminar Room

Thu May 5 08:01:49 PDT 2022

Reminder about the Data Science Seminar tomorrow Friday at noon.
Please register using the link below if you plan to attend.

V. Tsotras

-------------------------------------------------------------------
> The next Data Science Seminar will be this coming Friday May 6th, 2022,
> from 12:00-1:00pm at the MRB Seminar Room (1st floor).
>
> **** Pizza and refreshments will be provided ****
>
> To keep track of the number of attendees, please *register* at:
> https://www.eventbrite.com/e/data-science-talk-tickets-331275533037
>
> The talk will be given by Prof. Samet Oymak, Department of Electrical and
> Computer Engineering, UCR
>
> Title: Understanding Large ML Models through the Structure of Feature
> Covariance
>
> Abstract:
>
> An overarching goal in machine learning is to enable accurate statistical
> inference in the setting where the sample size is less than the number of
> parameters. This overparameterized setting is particularly common in deep
> learning where it is typical to train large neural nets with relatively
> smaller sample sizes and little concern about overfitting. In this talk, we
> highlight how structure within data is a catalyst for the empirical
> success of these large models. After linking deep nets to linear models,
> we show that the eigen-structure of the feature covariance can help
> explain empirical phenomena such as noise robustness, the double descent
> curve, model compression, and the benefits of perfectly fitting the
> training data. In particular, we highlight that a typical feature
> covariance has a spiked structure with few large eigenvalues and many
> smaller ones. We proceed to discuss: (1) For data with label noise:
> Regularization is useful to restrict the optimization process to large
> eigen-directions and reduce overfitting, and (2) For (mostly) noiseless
> data: Incorporating small eigen-directions is crucial for striking a good
> bias/variance tradeoff. This in turn explains why larger models work
> better despite perfectly fitting the data with no regularization. Finally,
> we explain how our high-dimensional analysis framework based on Gaussian
> process theory facilitates these findings.
>
> ------------------------------------
> Sponsored by the UCR Data Science Center, the purpose of the Data Science
> talks is to foster collaborations between "core" Data Science faculty
> (from CSE/ECE/Stat Departments) and faculty/visitors from other sciences
> that face Data Science problems in their research. These informal
> gatherings are open to interested faculty and graduate students. Each
> meeting will start with a talk describing research problems and then a
> discussion will follow for questions, open problems, ideas for possible
> collaborations, etc.
>
> A full list of previous seminars appears at:
> http://datascience.ucr.edu/news
>
> Please forward this email to other colleagues or graduate students in your
> lab who may be interested.
>
> Moreover, if you are interested in giving a Data Science related talk,
> please contact me.
>
> Sincerely,
> Vassilis Tsotras
> Professor, CSE Department
> Director, Data Science Major
>
> _______________________________________________
> DataScience mailing list
> DataScience at lists.ucr.edu
> https://lists.ucr.edu/mailman/listinfo/datascience