Submitted by FresckleFart19 t3_z2hr4c in MachineLearning

I have uploaded two repositories to GitHub. The code was personal, so it's pretty much undocumented, but due to personal issues I currently can't work on them, and maybe the ideas here will inspire someone.

The main ideas are:

  1. Seeing categories as ensembles of ML models with more complex structure than X->(Y1,Y2,...), and using commutative diagrams as optimization objectives, with equality of morphisms (= models) replaced by some loss/objective function.

https://github.com/BeNikis/Category-Theoretic-Model-Ensembles
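The commutative-diagram-as-loss idea can be sketched numerically. The following is a minimal, self-contained illustration (not the API of the linked repo): each edge of a square A -> B -> D, A -> C -> D is a small linear model, and instead of requiring the two path compositions to be strictly equal, we penalize their disagreement.

```python
# Hypothetical sketch: a commutative square where each edge is a linear map,
# and "commutativity" is softened into a mean-squared-error objective.
import numpy as np

rng = np.random.default_rng(0)

f = rng.normal(size=(4, 3))   # A -> B
g = rng.normal(size=(2, 4))   # B -> D
h = rng.normal(size=(5, 3))   # A -> C
k = rng.normal(size=(2, 5))   # C -> D

def commutativity_loss(x):
    """Mean squared disagreement between the two paths A -> D."""
    top = g @ (f @ x)      # A -> B -> D
    bottom = k @ (h @ x)   # A -> C -> D
    return float(np.mean((top - bottom) ** 2))

x = rng.normal(size=(3,))
loss = commutativity_loss(x)   # >= 0; zero iff the square commutes on x
```

Training would then mean minimizing this loss over the parameters of the morphisms, alongside any per-edge supervised losses.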

  2. Using language models and some formal language for describing categories to automate the above work. If we have a base category with some second-level typing (for example, in a category containing only tensors, two objects could be different same-sized patches of one image; the shape of those patches can be the 'second-level' type of the objects, and we could apply any morphism that takes in that type), we could automatically find pathways (compositions of models) that do what we want. Or, if the category we're working in is, for example, Hask (Haskell types and programs), this could be used in automated programming.

https://github.com/BeNikis/Manipulating-Categories-With-ML
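A toy version of this "pathway finding" idea: tag each morphism with its input and output second-level type (here, tensor shapes) and search for a composition from a source type to a target type. Names and shapes below are illustrative, not taken from the linked repo.

```python
# Breadth-first search over typed morphisms: find a composition of models
# whose intermediate "2nd-level" types (shapes) line up.
from collections import deque

# (input_shape, output_shape, name)
morphisms = [
    ((8, 8), (4, 4), "downsample"),
    ((4, 4), (16,),  "flatten"),
    ((16,),  (10,),  "classify"),
]

def find_pathway(src, dst):
    """Return a list of morphism names composing src -> dst, or None."""
    queue = deque([(src, [])])
    seen = {src}
    while queue:
        typ, path = queue.popleft()
        if typ == dst:
            return path
        for a, b, name in morphisms:
            if a == typ and b not in seen:
                seen.add(b)
                queue.append((b, path + [name]))
    return None

path = find_pathway((8, 8), (10,))   # ['downsample', 'flatten', 'classify']
```

A language model would replace the blind BFS here, proposing plausible compositions in a richer formal language.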

  3. I have this very general concept of an agent-environment adjunction. An adjunction in category theory is a very loose but deep relationship between two categories, basically 'an isomorphism up to a specified morphism'. In the agent-environment case, the agent perceiving the environment is the forgetful functor (in reference to the many free-forgetful adjunctions), because we unavoidably lose some information when we perceive with limited sensors, and inferring the overall state of the environment from the agent's known information would be the free functor. Now, combining this with the above two ideas: the two categories could be the category of states of the environment and, on the agent's side, a category of ML model ensembles; the adjunction itself could be seen as an optimization objective (the information from the agent's sensors is injected into the category by the DataMorphism class in the first repo); and we could build better and better agent states by building up those categories with (co)limits, which again are fuzzified with some yet-unknown unsupervised objective.
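A numeric caricature of this adjunction, with all names and shapes invented for illustration: perception is a lossy linear projection (the forgetful direction), inference is its least-squares reconstruction (the "free" direction), and the unit of the adjunction, state -> infer(perceive(state)), becomes a reconstruction objective.

```python
# Agent-environment adjunction as a reconstruction loss (illustrative only).
import numpy as np

rng = np.random.default_rng(1)
P = rng.normal(size=(3, 8))          # perception: 8-dim world -> 3 sensors

def perceive(state):
    return P @ state                  # forgetful direction: loses 5 dims

def infer(obs):
    # least-squares pseudo-inverse: the "freest" consistent reconstruction
    return np.linalg.pinv(P) @ obs

def unit_loss(state):
    """How far the round trip is from the identity on environment states."""
    return float(np.mean((infer(perceive(state)) - state) ** 2))

s = rng.normal(size=(8,))
loss = unit_loss(s)                   # > 0: perception is genuinely lossy
```

An autoencoder is the learned, nonlinear version of this same round trip, which is why the comparison in the next paragraph goes through.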

This idea is similar to what is already happening in both ML and CT. On the ML side we have autoencoders and diffusion models, which go from environment -> 'agent' (some intermediary code) -> back to environment. In CT, for example, there is this paper on a syntax-semantics view of language models, which rings bells with the syntax-semantics adjunction in categorical logic:
https://arxiv.org/abs/2106.07890

I'm posting this due to personal circumstances, and because I'm currently on the edge of exhaustion working on this stuff; maybe bringing these ideas up will keep them from going to waste, if they're valuable in the first place.

60

Comments


einnmann t1_ixgxyrf wrote

Are you trying to describe optimization techniques using category theory or am I missing something?

If yes, then what's the point? You won't get anything out of renaming "mapping" to "functor".

11

FresckleFart19 OP t1_ixhc25h wrote

In one way it's more of a systematic view of architectures than a simple renaming: a framework with a mathematical theory behind it for investigating architectures and also their relationship with data. The simplest case is two morphisms in the supervised setting, m, d : X -> Y, where m is the model and d is the 'hidden' relationship between input and desired output.

In another view, it could be seen as a look into NNs themselves, since the well-known (in the CT crowd) adage that colimits are a way to put many things, in various relationships with each other, into a bigger thing is basically what a perceptron does, only with concrete numbers and objective functions instead of existence and uniqueness. I see a melding of fuzzy PAC learning and GOFAI structuralism as a way forward in AI, and this is a haphazard proposal of a way to do it. I chose categories because they seem far more flexible than any logic (which, by the Curry-Howard-Lambek correspondence, you can model in a category with sufficient structure). I mean, you can model categories in categories, it's used in the foundations of mathematics, and it's kind of intuitive if your mind is the right type of broken :)

4

dpineo t1_ixhd0tq wrote

I'm bullish on CT in ML. I predict that in 10 years the word "functor" will be commonplace, similar to how the word "tensor" is now.

I also predict that we will misuse the word "functor" to mean "function", the way "tensor" is currently misused to mean "array", and mathematicians will continue to curse us for screwing up yet another one of their terms.

28

ktpr t1_ixhgc37 wrote

Thanks for sharing! Are you in a PhD program?

2

Matsarj t1_ixiaoj1 wrote

I'm someone with a PhD in a field that uses category theory extensively, and I now work in DS. I'm finding a lot of ideas in this post unmotivated; I guess I'm pretty bearish on CT being applied to ML. Can you explain what problems you see category theory being used to solve?

13

yldedly t1_ixigekm wrote

I stumbled on this thesis some time ago, where the author formulates a category of causal models, whose arrows are structure-preserving transformations between models. It seems like it would be useful for causal model discovery.

9

Matsarj t1_ixijut0 wrote

That looks interesting, and I'm definitely not saying there's no intersection between CT and DS. There's some cool things I've seen with CT and probability theory recently. But to me it often seems like theory in search of a problem, and a far cry from functor becoming a common word among ML practitioners.

3

LazyHater t1_ixipeej wrote

CT is kinda good for ML if you have a complex topology of solution spaces. When programmers try to implement categories from a naive view, instead of applying sophisticated categorical constraints to their models, I definitely feel a sort of way about it. With that said, LLMs with analytic modules should be able to do categorical constructions in the not-too-distant future, which will be nice as hell. Optimizing functors might be a thing someday too, but it's definitely not there mathematically yet.

Im bullish on deriving (co)homologies using ML but it will be some time before we get there i think.

2

DeStagiair t1_ixiq07v wrote

I, for one, can't wait until abstract nonsense takes over probability theory*.

ML/DS definitely has a problem with papers sometimes playing fast and loose with the theory and instead focusing on getting +0.1% on your favorite leaderboard. "We make model bigger and big model make good predictions." In all seriousness, I think stuff like this is interesting, but I don't think it's very useful as of now. The fact that everything is just an array in Python has always irked me a bit, but maybe there's some way to leverage type theory to make ML code more robust. Or perhaps CT can be used to make it easier to compose probabilistic models.
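One low-tech way to get a bit of that "typed arrays" robustness today is to check shapes at function boundaries. Dedicated libraries exist for this; the following is just a dependency-free sketch with invented helper names.

```python
# Runtime shape checking as a poor man's tensor type system.
import numpy as np

def expect_shape(arr, shape):
    """Check an array's shape; None means 'any size' in that position."""
    ok = arr.ndim == len(shape) and all(
        want is None or got == want for got, want in zip(arr.shape, shape)
    )
    if not ok:
        raise TypeError(f"expected shape {shape}, got {arr.shape}")
    return arr

def linear(W, x):
    # W : (out, in), x : (in,) -> (out,)
    expect_shape(W, (None, None))
    expect_shape(x, (W.shape[1],))
    return W @ x

y = linear(np.ones((2, 3)), np.ones(3))   # OK: result has shape (2,)
```

A static version of the same idea is what shape-annotated type systems aim for; here the mismatch is only caught at call time.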

> * From Disintegration and Bayesian Inversion via String Diagrams

1

Matsarj t1_ixirp1l wrote

I guess I'm separating purely categorical applications from TDA applications; I agree that things like persistent homology will probably be useful.

2

LazyHater t1_ixiuzpu wrote

Plenty of classical techniques are being developed for persistent homology on real data, but we probably need a very good LLM with some very good HoTT to derive non-simplicial (co)homology for some given category -> some given abelian category.

1

Matsarj t1_ixiwioj wrote

This sounds really interesting. Can you expand here or link to any resources related to this? I'm most interested in where you would apply these cohomology theories.

1

LazyHater t1_ixj3e2t wrote

Resources are quite scarce, I'm afraid. Emily Riehl and company are working on (inf,1)-categories to establish homotopy between derived functors, for applications in univalent foundations. For a computer algebra system or proof assistant, type equivalence is required to abstract away implementation details. To actually compute homotopy equivalence, it's better to compute cohomology equivalence, but simplicial cohomology is often too expensive to compute. So it's an open problem whether we can optimize a derived homology functor between a derived (enriched) functor and an abelian (enriched) category (which still lacks a proper definition, afaik). But it's a goal I heard at a HoTT talk once: get non-simplicial cohomology of types instead of computing homotopy (which is computationally impossible at scale). Feel free to steal and spread this idea, but it's kinda original and speculative.

tl;dr application is computing homotopy equivalence of types at a reasonable expense

3

dpineo t1_ixj7ot5 wrote

Sure. I see the potential of CT as being a language for expressing, reasoning about, and ultimately designing, AI/ML architectures abstractly.

In software development, we have the concept of "design patterns" that provides a common vocabulary with which we can describe recurring patterns in software design at an abstract level. It cuts past the implementation details and allows us to focus on larger concerns, such as the composition and coupling of components, and the flow of information. This maturity in software development has allowed us to grow past brute-forcing spaghetti-code programs to developing robust enterprise-sized systems.

I believe that AI/ML is still in its spaghetti-code infancy. We have no idea how to build and compose AI/ML components into a system in a disciplined way. To scale up to larger and more complex AI/ML systems, we're going to need to step back and look at AI/ML architectures more abstractly, the way software did with design patterns. I think CT may be able to help with that.

2

Cyrus13960 t1_ixjhcvl wrote

I'd take a week off from it; your unconscious will continue to process things, and you'll be rested when you come back. Even if it's all somehow wrong, it's unlikely that you won't get something out of this process, as long as you put in more effort than others are putting into their comments in this thread.

1

Matsarj t1_ixjliwb wrote

So I'm pretty familiar with homotopy theory but don't know any type theory, homotopy or otherwise. What does determining whether types are homotopy equivalent get you in terms of ML applications?

1

LazyHater t1_ixjq0v8 wrote

In "layman's" terms, it gives you a) an environment for ML models to verify their proofs, and b) a rich space for ML to study relations between different fields of mathematics, logic, philosophy, ethics, and everything else by default at that point.

Propositions are implementations of types, so just being able to say when propositions are equivalent in a rigorous way is good for science anyway.

3

bluecat1789 t1_ixk81sb wrote

Are your works similar to this series of lectures on Categories for AI: https://cats.for.ai/ ?

Overall I think this trend is quite nice, and I look forward to new ideas from the trend.

2

LazyHater t1_iyeaom6 wrote

Yes and no. The fundamental ideas, once they start to sink in, show clear parallels between vastly different fields of analytic thought. The more you understand the framework though, the more its limitations can be concerning. Dependence on the axiom of choice, for example, and the naturality of choice in the field itself, leads some to speculate that if contradiction can be chosen true, the theory's implementation (with the vast majority of categorical proofs appealing to choice) is completely broken.

It's overwhelming at times how applicable category theory is from the right perspective, but underwhelming how its implementation in set theory can be expected to pan out.

tl;dr: category theory is dope but aoc is sus

1