The Choice of Color
and the absolution of responsibility and choice in computer science approaches to art + design
Atop both of the desks where I work, at home and in my office, lie art books of Ponyo, a Studio Ghibli movie loosely inspired by the Little Mermaid. As I work, I always see the titular Ponyo within my periphery, a chibi magenta-clad mermaid peeking out of a bucket. These books were gifts from someone beloved to me when I moved from California to New York, and I flip through them when I get fatigued during the work day, to remind myself that I am not alone in my PhD.
Whenever I open the book, I like to relax my eyes over stills from the film, to take in the idyllic blues of the animated Atlantis designed by Studio Ghibli and to study the process by which their watercolors became film. However, beyond the pretty pictures, I like to think about something that struck me the first time I read the artbook of Ponyo. I had been reading an interview where Michiyo Yasuda, a chief color designer behind Ponyo, discussed her choices of color. In particular, I remember her discussing how the home Ponyo stays in, a cozy cottage on a cliff was originally drawn with a modest and earthy brown roof. Then during revisions, the color of the roof was revised to take on a terracotta, almost coral color. Yasuda discussed how this decision livened the landscape and added to the fairytale ambience of Ghibli films.
This detail, minor as it was, was a reminder to me that most art is a matter of care and deliberate choice. As a computer science PhD student in the field of computational design, I also often make decisions about hue, composition, and shape. However, I feel like I do not make explicit choices as often, often relegating the final call to math, stochastic functions, and programmed randomness. For example, during my first year of grad school,. In this project, color was keystone; color was what established depth and form in my otherwise blobby, noisy pointcloud animals. (They were colored based on math functions I modified from Felix Herbst, who I had collaborated with while I worked for the design agency prefrontal cortex.) Through math, they took on saturated hues like that of Skittles and confetti, an effect that gave the shapes an accidental beauty. When users played with the system, altering sliders and jumping through the design space, the shapes became digital pinatas, exploding and rezipping again and again into new amorphous forms.
I would always get curiosity from pilot users of my system about the application of color in my project. "Does the color mean something? Is the color just for aesthetics? It looks nice. Color made the experience more delightful." To these remarks I would always answer, "It was colored by math."
There was something fundamentally different about how Yasuda and I explained our choice of color. In the Art of Ponyo, Yasuda owned the decision one hundred percent, taking pride in the care and caution that went into the choice of a terracotta roof and how it completed the effect of paradise. On the other hand, I never felt as though I could do the same. "It was colored by math" ultimately implied "it wasn't my decision." I have thought about this for a while, and I now attribute this difference to the way computer science has positioned me to approach arts and design—through abstraction. When I program functions, I wrap logic together such that choices become automated ones privy only to logic flow through the computer. I abstract away choices, absolving myself of the responsibility to decide on things as deliberate as the color of a roof or the color of a particle.
I know that traditional artists, for example Studio Ghibli, have a lot of friction with such approaches. I surmised as much given how many times in the Art of Ponyo they referenced the importance of hand-drawn ocean waves and how the studio as a whole had resisted computer graphics for decades. But I found this sentiment acutely summarized when I watched a Youtube video titled "Hayao Miyazaki's Thoughts on Artificial Intelligence". In this video, a group of programmers presented Miyazaki with an AI-animated zombie. Exuberantly and with a bit of bravado, a lead programmer elaborated to a crowded conference room of people about the potential of AI-driven animation and video games potential as his zombie avatar convulsed on the screen. It moved in a macabre and humorous way, using its chin as a fulcrum to move its entire body. The camera then panned to Miyazaki, who resolutely stated, "I would never wish to incorporate this technology into my work itself. I strongly feel that is an insult to life itself."
The camera panned to back to the lead programmer, in a silent zoom so awkward it felt like a scene out of The Office. The video then cut to Studio Ghibli director Suzuki, who asked the programmers what their ultimate goal was—"Well, we would like to build a machine that can draw pictures like humans do." The video finished with a sombre Miyazaki saying while sketching, "I feel we are nearing the end of time. We humans are losing faith in ourselves."
I found this video hilarious and painful at the same time, as someone who really related with the programmers.I have constantly been the lead programmer in the video, that evangelist extolling some emerging tech to outsiders. In fact, what the second programmer said hit really close to home—their research ambition and my current one were one and the same: to work on image-making machines. I was cringing yet cracking up when I realized that by transitive property, Miyazaki and company would probably consider my work an insult to life itself.
I have been reflecting on why such work could be construed as insulting. Obviously, at the surface-level, presenting a grotesque 3D data marionette to a renowned animator known for 1) eschewing 3D technology and 2) family-oriented animated movies was a setup for failure. Many comments on Youtube touched upon this. But I think that at a higher level it comes back to the automatic attention to choice, and how AI really is the apex of this. One literal way to interpret his comment about "humans losing faith in ourselves" is that we do not have the confidence in ourselves to make choices ourselves.
The more I have thought about it, the more I have realized that the abstractions we build to put distance between ourselves and choice is a fundamental tension within many areas of computer science. One example is crowdsourcing, when we aggregate data from large numbers of people online we call "crowds". Often we build datasets for AI algorithms through the crowdsourcing of image labels or simple question-answers. When the algorithms end up biased or their accuracy runs too low, we claim that there wasn't enough data or that there was bias inherent within the crowds. The scale, engagement, or skill level of the crowd were not enough—go collect more. The faults of the program never fully lie with the programmers, because just as the tasks outsource the activity of deciding to faceless crowds, they also outsource responsibility.
The most recent advancements now forgo human-curated datasets for Internet scale archives, passively inhaling the results of our choices without explicitly asking anyone for any more. I have been playing with some of the state-of-the-art algorithms for text-to-image generation, which were trained by scraping the Internet for images and their captions. The result is that we are now able to produce results that I am sure are far beyond what the Japanese programmers imagined when they said they wanted to make machines draw images as well as people. We can now type anything and have a computer generate a coherent image to match the text as a caption. These algorithms have become part of my research agenda, and in demoing some of the results of these algorithms to a few students in my PhD cohort, I have met the same type of criticality and resistance that Miyazaki once gave that lead programmer.
A fellow student once asked me, "Why do what these algorithms generate deserve to be called art?" I remember not having an answer, and saying something weak to the effect that since artists on the Internet have engaged with these algorithmic generations as art, we as researchers were following suit. In search of an answer and fatigued from looking at so many parodies of Picasso, Monet, and other real artists from these text-to-image algorithms, I started going to the Metropolitan Museum of Art. When you see so many fake things, you want to see real things. Instantly, I loved the Met the way many visitors do, in utter awe over the incredible compression of time, history, and culture into one museum.
I walked around the wings, for the first time impressed that everything came with a caption, an artist, and a date. I learned about how ancient Chinese painting lauded the reclusive scholar, how Egyptian hieroglyphs actually had a cursive form, and how Western painters once captured the dance rituals of Native Americans as outsiders. I walked by scarabs, painted porcelain vases, and gold filigree crowns— artifacts dating thousands of century B.C. that looked as though they could have been made today. And in these blurbs captioning these pieces, I read stories. For example, in an ancient Chinese landscape of a waterfall, I was directed to study a lonely figure hidden under the rush of clouded ink, a representation of how reclusion from the ills of society and seclusion in nature was an ideal in the past. In the wing dedicated to a French art who depicted the dance of the Elem Indian Colony, a Native American tribe that had its origins in California, I had to do a double take to see two shadowed Western figures in the background of the painting, two voyeurs in a ritual dance. I think it is amazing to think that all of these pieces, all of art, in the museum and out of it, are a composition of choices, from corner to corner of the canvas, contour to contour of sculpture. Even looking up at frescoes of battle scenes in the European Paintings section, I would see the detail that went into the sunsets in the background. The skies bled in their own right, heightening the drama of the battles. Again, my refrain about the deliberate choice of color holds.
I think the Met showed me and what it echoed for me from the Art of Ponyo was that deliberate choice and labor over every aspect of the canvas was what gave these pieces value and what made them art. Perhaps that is the answer as to what makes art art and AI-driven or computer-assisted generations lesser than art to some. Art exists as canvases spun from a cultural fabric; their images yield captions, text that describe stories, movements, and histories. Their images are not mere images, which is as far as the best of computer programs have gotten. We may now be able to go programmatically from text to image, but we can't do what visual art has always done, which is to use images as a medium: a channel for story and thought. A large part of this is because we haven't opened up enough places within these algorithms to let us make choices as simple as a choice of color. That would be a first step toward creating a story of choices behind each generation, a step towards making them truer forms of art.
I created a system embedded with artificial intelligence that produced animal shapes which I endearingly refer to as my gummy bears and chicken nuggets project.
In fact, AI-generated motion and animation had even been one of my interests since my undergrad at UC Berkeley. I remember spending many afternoons at cafes, trying to figure out how to make a 3D mannequin learn a dance from a Youtube video. Also, right before I began grad school, I spent my free time making dancing AI avatars for my dance teacher aunt, laughing with her at the imperfect, contorted results. These data marionettes danced with the same jitter and macabre disrespect of physics as what the lead programmer demoed to Ghibli.