Cinematographic Techniques in Architectural Animations and Their Effects on Viewers' Judgment

Eui-Jee Hah 1, Peter Schmutz 1,*, Alexandre N. Tuch 1, Doris Agotai 2, Martin Wiedmer 3, and Klaus Opwis 1

1 Department of Psychology, University of Basel, Switzerland
2 Department of Architecture, ETH Zürich, Switzerland
3 Institut für Design- und Kunstforschung IDK, FHNW, Basel, Switzerland

Computer-generated animations have become a commonly employed medium to communicate architectural designs and projects. Because designers of animations are not constrained by real-world conditions and do not share the rich history of film, they do not readily benefit from the body of cinematographic techniques that filmmakers can draw upon. Specialists argue that this results in unappealing, lackluster animations that could be vastly improved by the application of filmmakers’ craft knowledge. The aim of this study was to identify which aspects of film craft show the most promise by systematically examining the use of cinematographic techniques in animations and their effects on viewers’ evaluations. Our analysis of award-winning architectural animations established average shot length as a reliable and valid predictor of participants’ judgments of salience, vividness, and diversity. A shorter average shot length resulted in more favorable ratings, while a longer average shot length led to the opposite outcome. We consider these findings from a broader filmic perspective and discuss them in light of their usefulness for designers and the field.

Keywords – Architecture, Animation Design, Viewer Judgments, Satisfaction.

Relevance to Design Practice – Data presented in this article suggest that scene duration (average shot length) in architectural animations is a key design feature that impacts viewers’ judgments.

Citation: Hah, E.-J., Schmutz, P., Tuch, A. N., Agotai, D., Wiedmer, M., & Opwis, K. (2008). Cinematographic techniques in architectural animations and their effects on viewers’ judgment. International Journal of Design, 2(3), 29-41.

Received October 10, 2008; Accepted December 15, 2008; Published December 31, 2008

Copyright: © 2008 Hah, Schmutz, Tuch, Agotai, Wiedmer, & Opwis. Copyright for this article is retained by the authors, with first publication rights granted to the International Journal of Design. All journal content, except where otherwise noted, is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License. By virtue of their appearance in this open access journal, articles are free to use, with proper attribution, in educational and other non-commercial settings.

*Corresponding Author: peter.schmutz@unibas.ch

Eui-Jee Hah, B.Sc., is a graduate student in the Department of Cognitive Psychology and Methodology at the University of Basel, Switzerland. He has a background in cognitive science and media studies and his work concerns itself with the intersection where science and art collide.

Peter Schmutz, M.Sc., is a researcher in the Department of Psychology, University of Basel. He has a background in Cognitive Science, and his work focuses on Human-Computer Interaction and Usability.

Alexandre N. Tuch, M.Sc., is a Ph.D. student in the Division of Cognitive Psychology and Methodology within the Department of Psychology at the University of Basel, Switzerland. He has a background in cognitive science, and his work focuses on human-computer interaction within the area of aesthetic perception and psychophysiology.

Doris Agotai (Dr. sc. ETH, dipl. Arch. ETH) is a lecturer in the ETH Zurich (Dept. of Architecture) at the Zurich University of the Arts ZHdK (MAS Scenography) and a senior researcher in Applied University Northwestern Switzerland FHNW (Institute of 4D Technologies). Her research focuses on spatial perception and virtual environments. Recent Publications: Playtime - Starring Architecture in Peripheral vision and collective body, Museion - Museum for modern and contemporary art, Bolzano 2008, p. 281–30; Architekturen in Zelluloid. Der filmische Blick auf den Raum, Bielefeld 2007.

Martin Wiedmer, Professor, is an architect. Since 2000, he has been a lecturer for CAAD (Computer Aided Architectural Design) and New Media in the Institute of Interior Design and Scenography at the Academy of Art and Design HGK, University of Applied Sciences FHNW in Basel. He has been the head of the Institute for Research in Art and Design IDK in the same university since 2006. 2001 Feasibility Study «Pallazzo Multimediale Castelvetrano (I)». 2002/2003 «Metaworx – Approaches to Interactivity». 2004-2008 Board Member of the Swiss Design Network (SDN). 2004/2005 Research project «Intelligent skin – The fourth dimension». 2005/2006 Research project «Compositing spaces». 2006–2008 Research project «LifeClipper2».

Klaus Opwis is a full professor of Cognitive Psychology and Psychological Methodology in the Department of Psychology at the University of Basel, Switzerland. He has many years of research experience, especially within the areas of cognitive psychology (memory and learning, perception, thinking and problem solving) as well as human-computer interaction. He has published over 70 research papers in articles and books.


In recent years, advances in 3D computer modeling and animation and the growing availability of commercial software geared for these applications have led to the widespread adoption of computer-generated animations among architects wishing to communicate their designs. Although their use has become commonplace, computer-generated animations are a relatively new resource for the representation of architecture, and therefore architects and designers still lack the experience to exhibit buildings through moving images (Alvarado, Castillo, Márquez, & Mayorga, 2005). Moreover, as the software tools commonly used in the making of these animations are targeted for generic use, they do not possess specific aids for architectural visualizations (Alvarado & Isorna, 2004). Specialists have lamented the lack of sophistication in the way these animations are presented, stating that they are usually straightforward and uninspired tours of buildings without much appeal—while examples that do not follow this trend exist, they are the exception rather than the rule (Alvarado & Isorna, 2004; Alvarado et al., 2005; Knox, 2005; Ng, Schnabel, & Kvan, 2006; Wiedmer, Agotai, Lenzin, & Kempter, 2006). In an attempt to remedy these shortcomings, these authors shift their attention to traditional film in general, and to specific cinematographic techniques in particular, as a promising source of inspiration. Indeed, the value of applying an analysis of film to animation was recognized by Hochberg (1986), who promoted the study of film techniques to aid the then emerging technology of computer-generated imagery. He argued that filmmakers have the advantage of being able to point their cameras at real-world events, and thus implicitly captured in the resulting film many of the constraints on object construction, appearance, and behavior that our visual systems might make use of. 
Computer-generated imagery, on the other hand, had no such constraints; within the confines of the given technology, anything could be portrayed according to any rule (for discussion, see May, Dean, & Barnard, 2003). Adding the time dimension meant that designers of animations were confronted with a whole new set of problems that filmmakers did not have. The work of the animator is less a recording of events as they unfold (however elaborately staged these may be) than a simultaneous creation and portrayal of these events, where no formal, physically determined standards exist. In comparison, film as a medium has been in existence for over a hundred years, and filmmakers have amassed a well-defined set of “rules” that govern the proper use of cinematic techniques (Bordwell & Thompson, 2004; Monaco, 2000). They have learned through a century of experimentation which forms of dynamic scenes are easily comprehended by their viewers and which are not. In the process, film has acquired a language of its own, a so-called “film grammar” which embodies this extensive corpus of craft knowledge. Because much of the filmmaking advice is directed toward rapidly and succinctly conveying narrative information or toward leading the viewer to infer motive and intention, it is difficult to extrapolate the craft knowledge therein to the more abstract motives at play in architectural animations. Nevertheless, attempts have been made to distill this film grammar into recommendations that can be applied to the various stages of architectural animation production (Alvarado et al., 2005; Alvarado & Isorna, 2004; Knox, 2005; Ng et al., 2006; Wiedmer et al., 2006). These studies have, for the most part, focused on certain aspects of films or on specific cinematic techniques and attempted to reproduce these in the animation domain.
However, the case-by-case approach taken by these authors to transfer knowledge between the domains lacks insights that can be readily generalized, making the (at times anecdotal) advice offered difficult to justify. While this has been recognized as a problem (Alvarado et al., 2005), more strikingly, there exists no objective measure with which the results of the application of these techniques can be validated. Although subjective judgments by the designers of the resulting animations may point to an improvement over “non cinematic” animations, no systematic links to actual perception and assessment by the viewer can be established with this method. This study seeks to correct for these deficiencies by doing two things: 1) providing an analysis of the cinematographic techniques employed in architectural animations based on objective, comparable measures, and 2) systematically examining their effects on viewers’ judgments of the animations.

Analyzing Cinematographic Techniques

Alvarado and Isorna (2004) performed a comparative analysis of various scenes depicting architecture from eight classic movies, six documentaries on buildings, and seven award-winning animations. They examined the productions at three levels of film grammar: the framing or composition of the image, the nature of the particular camera shots and camera movements that were used, and the editing or montage of these shots into a coherent whole. With this information, the authors constructed detailed two-dimensional plans of the structures that were portrayed, noting points of view and detailing camera motion and flight paths through and around the buildings. Unsurprisingly, they discovered that movies employed far more cinematographic techniques than animations: much more attention was paid to the careful composition of each shot, and the rhythmic pace of the presentation was determined by strategically placed cuts between shots. Animations, on the other hand, made more use than movies or documentaries of tracking shots, in which the camera is continuously displaced in relation to the line of sight, for example during a fly-through of a room. They also featured fewer cuts than the films in the same timeframes, resulting in a longer average shot length.

Because the sample size was limited to only seven animations, it is hard to gauge the external validity of these findings. However, they are interesting to note, because they offer valuable clues for further research. The current study attempts to achieve a high ecological validity by including a larger sample of animations in the analysis. Because extensive dissections of cinematographic techniques, such as Alvarado and Isorna’s (2004), can lead to a large body of conclusions with varying degrees of precision, it was our decision to focus on concrete, objective measures that enable ready comparisons. Thus, we decided against performing the analysis of framing and composition for each single shot, which are both often dependent on artistic cues that are neither obvious nor accessible to objective, reliable examination. Instead, we shift our attention to relevant, cinematographically influenced technical aspects whose observation produces low to nonexistent margins of error; these are total length of the animation, total number of shots, average shot length, total length of tracking shots (camera displacements), and the duration of tracking shots as a percentage of the entire animation. Although we sacrifice the richness of information that could be gained by employing more exhaustive, qualitative methods, the advantages to our approach are evident and compelling: all of our selected properties can either be assessed quite easily and reliably by a trained eye, or computed from the other properties (and in the case of total length simply taken note of), and their notation in discrete units (e.g., seconds or percent) allows instant comparability.

Measuring Viewers’ Judgments

To assess the effects of the properties on the judgments of viewers, it is first necessary to define meaningful criteria by which the animations should be assessed. Ideally, these criteria should be informed by the intended purpose for which the animations were created. This, however, can be difficult to determine. The reason becomes clear when we compare the application areas of the two domains: in their most common form, films serve a narrative and/or an entertainment purpose, filtered by commercial and artistic interests. The production of architectural animations, on the other hand, is shaped by a variety of competing influences. Animations can be made to demonstrate certain design aspects of a structure, promote the purchase or use of a building, summarize a whole project, express the capabilities of a digital model, the working team and/or the client, and so on (Alvarado et al., 2005). Despite this apparent wealth of design purposes, some common ground can be found in what animations should achieve in most, if not all, of these situations. Because persuading interested parties to purchase a building is both a compelling and common use of architectural animations, we used this as a guiding scenario for the formulation of our key criteria. This scenario possesses some obvious parallels to advertising, with television advertisements in particular sharing some of the challenges animations face when trying to solicit business and influence purchasing behavior. In this sense, research from that field can prove to be useful in informing what criteria are most adequate as measures for successful animations.

In their analysis of the effects of video montage on the appeal and persuasiveness of television advertisements, Larsen, Wright, and Hergert (2004) proposed that responses to complex stimuli were mediated by two factors: boredom and confusion. Their experiment found some evidence for their hypothesis: the number of cuts in an advertisement had an influence on viewers’ reports of boredom and confusion, although the effects did not always point in the expected direction. Of more immediate interest to our discussion is that they found evidence for the validity of these two proposed dimensions in the answers of the participants. Based on these findings, boredom and confusion seem to be promising concepts that, in our case, could be adapted to function as variables for gauging viewers’ judgments of animations. However, considering the broad nature of these concepts, it is likely that they lack the precision to adequately capture viewers’ reactions to the intricacies inherent in the abstractness of architectural animations. Also, because of the wide range of intents and purposes that animations must serve, we feel that more dimensions are needed to account for these aspects (even if our focus rests on the sales scenario for the time being).

Kardes (2002) mentions salience and vividness as attributes of stimuli which can be instrumental in guiding the attention of consumers. Salient information stands out from a particular context or background, whereas vivid information always stands out, regardless of the context in which it is perceived. Vivid information is “emotionally interesting, concrete, and image-provoking, and has immediate or direct implications for the decision maker” and as such “stimuli that are vivid to one individual may not be vivid to another” (p. 361). While this definition complicates objectivity for vividness, the distinction between the two aspects is a useful one. It is reasonable to assume that the influence of these factors leads to noticeable differences in the animations that can be measured in their reception and appraisal by viewers.

A further aspect of importance in connection with animations is their purported veracity in portraying the project or building in question. While realistic accuracy is seldom the designers’ only goal, it stands to reason that this facet receives more attention when an animation serves sales purposes.

To sum up, we use these various dimensions to define the criteria for evaluating the “success” or effectiveness of animations. These criteria, in turn, guide the construction of a measuring instrument for determining participants’ evaluations of the animations.

Experiment 1



Forty undergraduate psychology students at the University of Basel participated in the experiment. The sample consisted of 26 females (M = 20.2 years, SD = 2.4; range = 18-26) and 14 males (M = 23.1 years, SD = 2.9; range = 20-27). Participants received course credit for their participation.


Twenty computer generated architectural animations were chosen as stimuli from the Animago DVDs (Digital Production GmbH, 2002, 2004, 2005). The Animago Award is a highly regarded contest for digital content creation in German-speaking Europe with contestants taking part from all over the world, and DVDs featuring some of the best award-winning material are released yearly. Key criteria for inclusion in the study were perceived technical quality, creativity, and realism. These criteria were informed by various aspects, including (but not limited to) the level of sophistication and detail of the 3D models, the originality of the presentation, and a general overall impression of accomplishment and excellence. These criteria were judged based on subjective impressions of the authors. Our selection aimed to maximize representativeness and ecological validity. For this reason, the 20 animations that were included covered a broad range of thematic content, depicting various kinds and styles of architecture, from old cathedrals to contemporary shopping malls and office complexes. Despite the heterogeneous nature of the content and the wealth of styles, the selection process achieved a level of consistency on the technical level by eliminating cases with quality that was obviously insufficient.

For practical reasons, the animations were extracted from the source DVDs with zero to minimal scaling from their native resolution and compressed as MPEG-4 video files with the Xvid codec version 1.1.2. A high variable bit-rate was used to ensure that no compression artifacts would impact the image quality. Indeed, close visual examination of the resulting video files indicated that they did not differ from the source DVD in any noticeable way. When an animation was longer than three minutes, a typical excerpt of one to three minutes was selected according to the subjective judgment of the authors. Great care was employed to ensure that no ‘artificial cuts’ were introduced as a consequence of the selection, i.e., shot boundaries as they appeared in the original animations were left intact. Apart from the compression, the animations or the chosen excerpts were not altered in any way. To avoid possible confounding, all animations were presented without sound.

We noted cinematographic and technical properties for each of the animations; these were total length in seconds, total number of shots, total length of tracking shots in seconds, video file size in kilobytes, and resolution in pixels. Shot boundaries and tracking shots were judged by the authors by closely examining scene transitions and taking note of the time codes of their occurrence and duration. In addition to these absolute values, we computed the following relative values: average shot length, total length of the tracking shots as a percentage of the total length of the animation, and file size in kilobytes per thousand pixels.
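The relative values follow from the absolute ones by simple arithmetic. A minimal sketch in Python, assuming each shot has been logged with start and end time codes in seconds and flagged, as a whole, as a tracking shot or not (a simplification; the `Shot` record and the function names are hypothetical and not part of the original study materials):

```python
from dataclasses import dataclass


@dataclass
class Shot:
    """One shot, as logged from the time codes (hypothetical record)."""
    start: float    # start time code in seconds
    end: float      # end time code in seconds
    tracking: bool  # True if the camera is displaced during the shot


def shot_metrics(shots):
    """Derive the absolute and relative shot measures for one animation."""
    total_length = sum(s.end - s.start for s in shots)
    tracking_length = sum(s.end - s.start for s in shots if s.tracking)
    return {
        "total_length": total_length,
        "num_shots": len(shots),
        # average shot length = total length / number of shots
        "average_shot_length": total_length / len(shots),
        "tracking_length": tracking_length,
        # tracking shots as a percentage of the whole animation
        "tracking_percent": 100.0 * tracking_length / total_length,
    }


def kb_per_thousand_pixels(file_size_kb, width, height):
    """File size in kilobytes per thousand pixels of frame area."""
    return file_size_kb / (width * height / 1000.0)
```

Keeping the per-shot time codes, rather than only the totals, would also allow a later, more detailed categorization of tracking shots without re-annotating the material.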


We asked participants to rate the animations on the dimensions of vividness, salience, diversity, realism, and comprehensibility by means of a 20-item questionnaire. In the questionnaire, the dimensions were assigned three items each, apart from realism which was covered by two items. Also, in addition to these 14 items, six items asked participants to rate the animations on overall creativity, interestingness, and the quality of the content and presentation (see Table 1 for an overview). Items were presented as full sentences to which the participants indicated their level of agreement by means of Likert scales ranging from 1 (not at all) to 6 (strongly agree). We arranged the items in random order, apart from the six overall ratings that the participants completed at the end of the questionnaire. All participants received the same version of the questionnaire.

Table 1. Dimensions and their respective items

Design and Procedure

To reduce the length of the experiment, the animations were randomly divided into two sets of 10 animations each; an individual participant rated one set only. On arrival in the laboratory, participants were randomly assigned to one of the two sets, ensuring that each animation was rated by 20 participants. After introducing the participants to the setup, the experimenter moved to the adjacent control room where he could observe the participants through a camera and communicate with them by intercom. The animations were shown one after the other on a computer monitor in a controlled setting. After each animation was presented, participants were asked to rate it immediately by means of the questionnaire. In addition, the experimenter conducted a brief semi-structured interview with each participant after each of the final three animations of the respective sets. On average, participants completed the session in approximately 50 minutes.

Statistical Analysis

All data were checked for normal distribution. Three of the 20 animations displayed extreme values in their respective number of shots which exceeded three standard deviations from the mean, resulting in outlier effects and also skewing the distribution of participants’ rating data. These animations were excluded from the analysis, according to standard procedure for dealing with outliers of this nature. Table 2 shows the descriptive statistics for the technical properties after exclusion of the outliers.

Participants’ ratings of the animations by means of the questionnaire were grouped and aggregated for each animation according to their respective dimensions, so that each animation had six composite values (one for each dimension plus one for the overall rating). The limited sample size of 17 animations for 20 total items discourages the use of factor analysis to test the validity of the dimensions. However, the high intercorrelation of the items for their respective dimensions indicates that their aggregation through equal weighting is acceptable. Subsequent reliability analysis of the questionnaire exhibited a Cronbach’s Alpha between 0.82 and 0.95 for each of the 17 animations, indicating that participants’ ratings of the animations were in general highly consistent.
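For readers who wish to reproduce this kind of aggregation, the following is a minimal sketch in Python of an equally weighted composite score and of Cronbach's alpha for a single animation. It assumes the ratings are held as one row per participant with one column per questionnaire item; the function names and data layout are illustrative assumptions, not the authors' actual analysis script.

```python
from statistics import mean, variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a participants-by-items table of ratings.

    ratings: list of rows, one per participant; each row holds k item scores.
    """
    k = len(ratings[0])
    # Sample variance of each item across participants
    item_vars = [variance([row[i] for row in ratings]) for i in range(k)]
    # Sample variance of the participants' summed scores
    total_var = variance([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def composite(ratings, item_indices):
    """Equally weighted composite: mean over the chosen items, then over participants."""
    return mean(mean(row[i] for i in item_indices) for row in ratings)
```

With equal weighting, the composite for a dimension is simply the mean of its items; alpha approaches 1 as the items vary together across participants.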

Table 2. Descriptive statistics of technical properties

To test the degree of linear relationship between the variables (observed and calculated values of the technical properties and the ratings of the participants), Pearson product-moment correlations were calculated. An alpha level of 0.05 was used for all statistical tests.
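As an illustration of this test, the sketch below computes a Pearson coefficient and converts it into a t statistic with n − 2 degrees of freedom. It assumes n = 17 animations (the sample remaining after outlier exclusion) and uses the tabulated two-tailed critical value of t for 15 degrees of freedom at the .05 level (2.131); the function names are hypothetical, and a statistics package would normally report an exact p value instead.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom.

    Undefined for |r| = 1 (division by zero).
    """
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Tabulated two-tailed critical t for alpha = .05 and df = 15 (assumes n = 17)
T_CRIT_DF15 = 2.131


def is_significant(r, n=17, t_crit=T_CRIT_DF15):
    """True if |t| exceeds the critical value supplied for this sample size."""
    return abs(t_statistic(r, n)) > t_crit
```

Under these assumptions, a coefficient needs an absolute value of roughly .48 or more to reach significance with 17 animations.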

Results and Discussion


For the observed absolute values, total length was significantly related to diversity (r = -.58, p < .05) and comprehensibility ratings (r = .56, p < .05). Apart from the relationship between total length of the tracking shots and realism (r = .49, p < .05), no other relationship was statistically significant. For the computed relative dimensions, only the correlation between average shot length and salience (r = -.58, p < .05) and the overall ratings (r = -.53, p < .05) reached statistical significance. Table 3 summarizes the correlations.

Table 3. Correlation coefficients for technical properties and participants’ ratings

Total Length and Average Shot Length

These results indicate that two factors are key to moderating participants’ ratings of animations: total length and average shot length. The positive correlation between total length of the animation and the comprehensibility dimension means that the longer the duration of the animation, the more likely it is for participants to rate it as being easier to follow or understand. This makes sense if one considers that a longer duration means more time can be utilized to portray the material, lengthening the exposure of the viewers to the content and providing them with more opportunity to process it. On the other hand, the negative correlation between length and diversity suggests that participants were more likely to view a longer animation as less diverse or entertaining. At first glance, this also seems plausible, because attention spans are likely to be tested as durations progress. However, previous research on narrative films (Carroll & Bever, 1976; Kraft, 1986; May et al., 2003) and television advertisements (Larsen et al., 2004) leads us to presume that these apparent feelings of boredom can be mitigated by employing the appropriate cinematographic techniques, and verifying this assumption for the animation domain is a stated goal of this study. So although the effects of total length on participants’ perception are interesting to note, it would be wise to view these results with caution.

The results for average shot length are also telling: the negative correlation between average shot length and salience indicates that the shorter the average time between two cuts, the more likely it is for participants to rate the animation as special and less ordinary. A similar statement can be made for the overall ratings: the shorter the average shot length, the better the animation scored on these overall ratings. Because the overall ratings in particular cover broad and possibly divergent concepts, a separate analysis on the level of the discrete items can be useful. Table 4 shows the correlations for the items that comprised the overall ratings.

Table 4. Correlation coefficients for overall ratings

Given these results, it can be said that the shorter the average shot length of the animation, the more enthralling, impressive, creative, and more technically accomplished it was judged to be. In our view, this finding is consistent with the results of Kraft (1986), who observed that the rate of cutting (and thus average shot length) in a narrative film sequence had a profound impact on viewers’ perceptions. Participants in that study rated action sequences with cutting as more interesting, more active, stronger, and quicker than sequences with no cuts.

Because average shot length is nothing more than total length divided by the number of shots, one could expect the two variables to be related. However, this proved not to be the case: total length and average shot length show no relationship (r = -.09, p < .75). (Indeed, considering that the total number of shots for all animations follows a normal distribution, this result is not surprising.)

Tracking Shots

The significant relationship between total length of tracking shots and realism is harder to interpret, since taken at face value it would suggest that these shots lend themselves well to building a sense of realism for the viewer. In films, tracking shots can be used for dramatic effect or to provide special emphasis to a scene. When employed in an unobtrusive manner, they can aid the viewer in comprehending the locus of the action. Camera displacements can be executed in several ways and at differing velocities, and can be combined with rotations in a single continuous sequence as the needs of the specific situation dictate (Bordwell & Thompson, 2004). Because of this flexibility in employment and the varying resulting effects on the viewer, a detailed analysis of the included tracking shots is difficult.

For our sample of animations, it is important to note that the correlation between total length of tracking shots and total length of the animation approaches significance (r = .46, p = .064); this relationship is echoed by the almost significant relationship between total length and realism (r = .48, p = .052). Considering that the possibly more meaningful relative measure of the percentage of tracking shots in the animation (as opposed to their absolute total length) did not show a significant relationship with the realism dimension (r = .20, p < .50), it is likely that this initial correlation of total tracking shot length and realism is merely a byproduct of the correlation between total tracking shot length and total length of the animation. In other words, the longer the animation, the more realistic it was perceived to be, while simultaneously containing more tracking shots. It seems that architectural animations by their very nature make more extensive use of tracking shots than is usual in traditional films, as real-world constraints for camera displacement do not apply to the virtual environment. Alvarado and Isorna’s (2004) comparison of film and animation sources supports this assertion: the authors found that over half of the shots used in the animations they analyzed featured camera displacement or motion (and acknowledged that it is not unusual for this figure to be much larger), whereas this occurred in only one fifth of the film shots. Indeed, the average total length of tracking shots in our sample was 74% of total length (see Table 2). At the current stage, it is hard to draw more specific conclusions without a more detailed categorization of the tracking shots that were employed in the animations. Given the complex nature of tracking shots and camera motion in general, it is conceivable that their total length alone is not a suitable measure to adequately account for this phenomenon.
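One way to probe whether a correlation is a byproduct of a third variable is a first-order partial correlation, which estimates the relationship between two variables after the linear influence of the third has been removed from both. The sketch below is offered purely as an illustration of that reasoning; it is not an analysis reported in this study, and the function name is hypothetical.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z.

    r_xy: correlation between x and y (e.g., tracking shot length and realism)
    r_xz: correlation between x and z (e.g., tracking shot length and total length)
    r_yz: correlation between y and z (e.g., realism and total length)
    """
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator
```

Called with the three coefficients reported above, `partial_corr(0.49, 0.46, 0.48)` gives the tracking-realism relationship with total length held constant; a value clearly below the zero-order coefficient would be in line with the byproduct interpretation, although with 17 animations such an estimate should be treated with caution.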


The short interviews that we conducted at the end of the experiment gave us insight into certain aspects of how the participants perceived the animations that were not covered by the questionnaire. Participants often saw in the animations a similarity to various other forms of media or art. In particular, they often compared certain animations to music videos and computer games, although the animations in question were related to neither. Computer games are undoubtedly becoming more and more sophisticated in their presentation and level of technical accomplishment, and their broad acceptance as a form of entertainment is reflected in our sample, where almost everyone indicated that they played computer games at least on a casual basis.

Further Procedure

Experiment 1 established two factors as exerting the most influence on participants’ ratings of computer-generated architectural animations: total length and average shot length. While it is interesting to note that total length seemed to greatly influence viewers’ evaluations, as previously stated, caution is advised in interpreting this outcome. It is conceivable that duration moderates the effects of the other cinematographic properties and that their influence on the dimensions was diminished as a result. Experiment 2 aimed to further investigate this phenomenon with an expanded set of stimuli. Moreover, the practical value of total length as a standalone variable is questionable. In practice, a designer is likely to be constrained by the maximum length of an animation (as imposed by the needs of the project or the client) but is allowed more liberty in how she makes use of this time. In this context, we deem average shot length to be of more practical relevance than total length in that it is more amenable to active manipulation by the designer.

Hence, for Experiment 2, we shifted our attention to average shot length as the primary determining cinematographic technique. Based on the results of Experiment 1, we hypothesize that there is a significant correlation between the average shot length of an animation and how salient it will be judged. Specifically, we expect animations with shorter average shot lengths to result in more favorable evaluations of these animations. Also, the results hint at the possibility that the length of tracking shots influences viewers’ perceived sense of realism. Experiment 2 attempted to verify these assumptions by systematically ordering animations according to average shot length and testing the degree of linear relationship with viewers’ ratings, while also taking the remaining cinematographic properties into account.

Experiment 2



Forty psychology students at the University of Basel participated in the experiment. Twenty-eight were undergraduates, and 12 were graduate students. The sample consisted of 23 females (M = 23.9 years, SD = 4.34; range = 20-31) and 17 males (M = 26.2 years, SD = 5.91; range = 20-40). None of the participants had taken part in Experiment 1. Participants received course credit for their participation.


Thirty computer generated architectural animations were chosen as stimuli; 15 of these animations were taken from the ones that we presented in Experiment 1. In addition, we selected 15 new animations that were not used in the previous experiment. Included in this new set were stimuli that were also used in a parallel study where the focus was on so-called sequence shots. Sequence shots are uninterrupted long takes that often involve sophisticated camera movement. By employing single continuous sequence shots, these animations exemplified a technique that has found particularly heavy use in architectural animations. Five of the 15 new animations were not taken from the Animago DVDs (Digital Production GmbH, 2002, 2004, 2005, 2006) but were obtained from other sources, in part from the respective architects or animators themselves. These animations were natively encoded in either the Windows Media Video or MPEG-1 Video format and had resolutions and sharpness similar to those of the other animations. On a technical level, their visual quality was comparable to the animations that were taken from the DVDs. The Animago animations were extracted from the DVDs in the same fashion as in the previous experiment. Although the main focus of our attention was on average shot length, we included other cinematographic aspects in our analysis, namely: total length, number of shots, total length of tracking shots, and length of tracking shots as a percentage of the total length. We decided to exclude the resolution and file size of the animations, as these basic technical attributes showed little promise in Experiment 1.


To reduce the length of the experiment, we decided to use a shorter version of the questionnaire we used in Experiment 1. We conducted an item analysis of the original questionnaire and although consistency was high (with Cronbach’s Alpha ranging from 0.82 to 0.95), we excluded the items that showed the lowest discriminatory power. The resulting new questionnaire comprised a total of 11 items which mapped to the dimensions in the following manner: salience was covered by three items, vividness and comprehensibility each by two items, and diversity and realism each by one item. In addition, two items asked participants for their overall ratings of creativity and interestingness. Again, the first nine items were put in random order with the two overall ratings at the end of the questionnaire. All participants received the same version of the questionnaire.
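As an illustration of the internal-consistency check mentioned above, the following minimal sketch computes Cronbach's Alpha from item scores. The function name and the ratings are hypothetical and are not taken from the study's data; the formula itself is the standard one, alpha = k/(k-1) * (1 - sum of item variances / variance of the summed scale).

```python
from statistics import pvariance


def cronbach_alpha(items):
    """Cronbach's Alpha for a scale.

    items: list of per-item score lists, one score per respondent,
    all lists of equal length.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Summed scale score per respondent.
    totals = [sum(scores) for scores in zip(*items)]
    item_var_sum = sum(pvariance(scores) for scores in items)
    total_var = pvariance(totals)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical ratings: 3 items answered by 5 respondents.
ratings = [
    [4, 5, 3, 2, 4],
    [4, 4, 3, 2, 5],
    [5, 5, 2, 2, 4],
]
print(round(cronbach_alpha(ratings), 2))
```

Values close to 1 indicate that the items of a dimension vary together across respondents, which is the sense in which consistency is called high above.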

Design and Procedure

We ordered the animations according to their respective average shot lengths into three groups of ten animations each: short, medium, and long. We generated 40 sets that each comprised 15 animations; an individual participant rated one set only. For the first 20 sets, five random animations were picked from each of the three groups and arranged in a separate random order for each set. The next 20 sets were built in the same fashion but comprised exactly those animations that were not included in the previous 20 sets. On arrival in the laboratory, we randomly assigned participants to one of the 40 sets, so that each animation was rated by 20 participants. As in Experiment 1, the animations were shown one after the other on a computer monitor in a controlled setting. The experimenter sat in the adjacent control room for the length of the experiment, where he could observe the participants through a camera and communicate with them by intercom. After each animation was presented, participants were asked to rate it immediately by means of the new questionnaire. On average, participants completed the session in approximately 50 minutes.
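The set construction described above can be sketched as follows. This is one possible reading, not the authors' actual procedure: it assumes that each of the second 20 sets is the exact complement of one of the first 20 sets, which is what guarantees 20 ratings per animation. The function name, the seed, and the animation labels are hypothetical.

```python
import random
from collections import Counter


def build_sets(groups, n_pairs=20, per_group=5, seed=0):
    """Build 2 * n_pairs stimulus sets from three groups of animations.

    Each set in the first half takes `per_group` random animations from
    every group; the matching set in the second half takes the rest.
    """
    rng = random.Random(seed)
    first_half, second_half = [], []
    for _ in range(n_pairs):
        picked, rest = [], []
        for group in groups:
            chosen = rng.sample(group, per_group)
            picked.extend(chosen)
            rest.extend(a for a in group if a not in chosen)
        # Separate random presentation order for every set.
        rng.shuffle(picked)
        rng.shuffle(rest)
        first_half.append(picked)
        second_half.append(rest)
    return first_half + second_half


# Hypothetical labels for the 30 animations, ten per shot-length group.
groups = [[f"{label}{i}" for i in range(10)] for label in ("short", "medium", "long")]
sets = build_sets(groups)
counts = Counter(animation for s in sets for animation in s)
print(len(sets), min(counts.values()), max(counts.values()))
```

Under this reading, every animation appears in exactly one set of each complementary pair, so with one participant per set each animation receives 20 ratings.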

Statistical Analysis

All data were checked for normal distribution. Four of the 30 animations displayed an average shot length that exceeded three standard deviations from the mean, the reason being that each of these animations consisted of a single continuous sequence shot with no cuts. As previously mentioned, they were included in the present study for reasons of consistency with a parallel experiment that deals with sequence shots in a more exhaustive manner. Because of the extreme nature of these four animations (a single continuous sequence shot comprised the entire presentation, effectively forcing parity between average shot length and total length), standard statistical procedure discourages their inclusion. For this reason, they were not considered for the analysis. (In fact, based on the significance of the sequence shot in narrative films as an advanced cinematographic technique usually employed for dramatic effect, one could argue that this difference constitutes a fundamental, qualitative disparity with the other animations.) Table 5 summarizes the technical properties of the remaining 26 animations, ordered according to group.

To test the degree of linear relationship between the variables, Pearson product-moment correlations were calculated. An alpha level of 0.05 was used for all statistical tests.
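For readers unfamiliar with the statistic, a minimal sketch of the Pearson product-moment correlation is given here. The data are hypothetical values chosen for illustration only (average shot lengths in seconds paired with mean ratings) and do not reproduce any result of this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical example: longer shots paired with lower mean ratings.
shot_lengths = [4.0, 8.0, 12.0, 16.0, 20.0]
mean_ratings = [5.1, 4.6, 4.4, 3.9, 3.2]
print(round(pearson_r(shot_lengths, mean_ratings), 2))
```

A negative coefficient, as in this made-up example, indicates that higher values of one variable go together with lower values of the other.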

Table 5. Descriptive statistics of cinematic properties ordered by group

Factor Analysis

To test the validity of our proposed dimensions, we conducted an exploratory factor analysis with the aim of reconstructing the five dimensions. The overall ratings were excluded from the analysis, as they were conceptually distinct from the other items. Contrary to our initial expectations, the principal component analysis with varimax rotation for the remaining nine items showed a clear two-factor solution.

Table 6. Factor loadings of the items

Note: Varimax Rotation with Kaiser Normalization was utilized. S= Salience; V= Vividness; D= Diversity; R= Realism; C = Comprehensibility.

As Table 6 illustrates, the items for salience, vividness, and diversity all showed exceedingly high loadings on the first factor, with negligible loadings on the second factor. This suggests that the dimensions salience, vividness, and diversity are closely linked and can be subsumed under a single comprehensive factor. The converse was the case for the realism and comprehensibility items, which showed comparably high loadings on the second factor. Accordingly, the second factor reflects a strong association between the realism and comprehensibility aspects of an animation. The salience-vividness-diversity factor exhibited an Eigenvalue of 5.54 and accounted for 61.51% of the variance. The realism-comprehensibility factor displayed an Eigenvalue of 2.98 and accounted for 33.14% of the variance. Thus, the two factors cumulatively accounted for 94.65% of the total variance. Given the high loadings on the respective constructs, the lack of substantial cross-construct loadings, and the strength of the explanatory power with regard to total variance, the two-factor solution appears to be sound.
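The link between the reported Eigenvalues and the percentages of variance can be checked with a short calculation. The sketch assumes that the analysis was run on the nine standardized items, so that the total variance equals 9; the function name is hypothetical.

```python
N_ITEMS = 9  # items entered into the analysis (overall ratings excluded)


def percent_variance(eigenvalue, n_items=N_ITEMS):
    """Share of total variance for one component of standardized items."""
    return 100 * eigenvalue / n_items


for name, eigenvalue in [
    ("salience-vividness-diversity", 5.54),
    ("realism-comprehensibility", 2.98),
]:
    print(f"{name}: {percent_variance(eigenvalue):.2f}%")
```

The values obtained this way differ from the reported 61.51% and 33.14% only in the second decimal place, which is what one would expect if the Eigenvalues in the text were rounded to two decimals.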

This two-factor solution is, in fact, reminiscent of the two dimensions of boredom and confusion used by Larsen et al. (2004) in their analysis of television advertisement montage. It could be argued that the components of our first factor (salience, vividness, and diversity) are rightly described as “positives” to the negatively framed boredom dimension, and that the second factor (realism and comprehensibility) constitutes an equivalent opposite to the confusion factor. This is an elegant explanation of these results and meshes well with the theoretical background. Nevertheless, we prefer not to refer to the constructs as boredom and confusion factors, as we feel that the descriptive precision of our original components would be lost with these titles. Instead, we will keep the original comprehensive titles of the two factors.

Results and Discussion


For calculating the correlations, we took the two factor values and tested their relationship with average shot length. In addition, we tested the relationship with total length, the number of shots, as well as the total length and percentage of tracking shots. Table 7 summarizes the correlations.

Table 7. Correlation coefficients for technical properties and participants' ratings

Average Shot Length

These results confirm our hypothesis: average shot length is significantly related to the salience-vividness-diversity factor (r = -.63, p < .01). The negative correlation means that the shorter the average shot length of the animation, the more noteworthy, lively, and diverse the impression it made on participants. Thus, the relationship between average shot length and salience from Experiment 1 could be reproduced. In addition, because the vividness and diversity dimensions are integral constituents of the composite factor, a significant relationship with these latter concepts can be established. In a comparable manner, average shot length was also significantly related to the overall creativity (r = -.61, p < .01) and interestingness (r = -.64, p < .01) ratings. This means that the shorter the average shot length, the higher the participants judged these animations on the creativity and interestingness scales. This is consistent with the results from Experiment 1, where we found a similar correlation between average shot length and creativity. Although we observed a considerable correlation between average shot length and interestingness in Experiment 1, it failed to reach significance (r = -.41, p > .10). The newly observed significant correlation could be due to the expanded set of stimuli, in which the attributes in question were better represented.
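As a rough plausibility check of the reported significance levels, a correlation coefficient can be converted into a t statistic with t = r * sqrt((n - 2) / (1 - r^2)). The sketch below assumes that the correlations were computed across the 26 analyzed animations (n = 26, df = 24); this unit of analysis is an assumption and is not stated explicitly in the text. The critical value 2.797 is the standard two-tailed t value for df = 24 at the .01 level.

```python
from math import sqrt


def t_from_r(r, n):
    """t statistic for testing a Pearson correlation against zero."""
    return r * sqrt((n - 2) / (1 - r * r))


N_ANIMATIONS = 26        # assumed unit of analysis (see lead-in)
T_CRIT_DF24_P01 = 2.797  # two-tailed critical t, df = 24, alpha = .01

for label, r in [
    ("salience-vividness-diversity factor", -0.63),
    ("overall creativity", -0.61),
    ("overall interestingness", -0.64),
]:
    t = t_from_r(r, N_ANIMATIONS)
    print(f"{label}: t = {t:.2f}, exceeds critical value: {abs(t) > T_CRIT_DF24_P01}")
```

Under this assumption, all three coefficients exceed the critical value, which agrees with the p < .01 levels reported above.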

Other Technical Properties

The significant relationship between total length and salience and comprehensibility from Experiment 1 could not be reproduced. Total length correlated neither with the salience-vividness-diversity factor (r = -.22, p > .25) nor with the realism-comprehensibility factor (r = .07, p < .75). Because total length approximated a normal distribution in both experiments, it seems improbable that this new finding can be attributed to statistical artifacts. Hence, the new result casts further doubt on the validity of our original finding. The highly significant correlation between total number of shots and the salience-vividness-diversity factor and the two overall ratings, on the other hand, mirrors the above-mentioned strong relationship of average shot length with these variables. Since total number of shots and average shot length are conceptually and statistically related (r = -.64, p < .001), this result does not come as a surprise. From a practical perspective, the relevance of the absolute total number of shots as a characteristic is, however, rather limited. Only when it is put into context with overall length do we arrive at the relative measure of average shot length, which is vastly more meaningful. In a similar vein, it is interesting to note that neither the total length nor the percentage of tracking shots revealed a relationship with the factors or the overall ratings. Again, this leads us to further question the validity of our original finding from Experiment 1. However, as previously noted, our measure of total tracking shot length may not adequately account for the underlying factors at work. For example, this simple measure fails to account for the velocity of camera motion, which is considerably more difficult to assess reliably but which, in the end, could prove to be the far more interesting property. Further research in this area should address this issue.

General Discussion

Average Shot Length as a Function of Pace

In our study we could establish average shot length as a determining cinematographic characteristic of architectural animations in that it had a significant effect on viewers’ evaluations of these animations. To gain a better understanding of why this is the case, it is of interest to examine shot length as a function of the broader rhythmic qualities of film. The issue of rhythm in cinema is enormously complex and therefore poorly understood (Bordwell & Thompson, 2004; Monaco, 2000). Although attempts have been made to formalize pace or tempo in film mathematically by describing the metric properties of the shots in relation to each other as found in narrative films (Adam, Dorai, & Venkatesh, 2000a, 2000b, 2002), there is no comprehensive theory that offers prescriptive advice. Nevertheless, it is clear that editing or montage of shots is one way for filmmakers to control the rhythmic qualities of their film. As Adam et al. (2002) put it, “essentially, the director controls the speed at which a viewer’s attention is directed and thus impacts on her appreciation of the tempo of a piece of video” (p. 474). Faster editing in this context means more cuts are employed in a given timeframe, automatically leading to a shorter average shot length. Just how changes in shot rate are to be interpreted is, however, a far from trivial matter. Adam et al. illustrate this by pointing out that film grammar does not codify content: “For example, a rising shot rate does not equate to a car chase. What it does indicate, in this case, is that the director is doing something; raising the pace, heightening demands on the audience, and this for a purpose” (p. 480). This example highlights the difficulty of establishing a link between apparent cinematographic principles and their exact function in mediating the intentions of the director, as content alone offers little in the way of clues. 
For our own study, the situation is exacerbated by the tradeoffs that have to be made when looking for an objective, quantifiable measure to link cinematographic principles in comparatively abstract animations to effects on viewers. Although notation of average shot length in seconds produces a concrete number, which enables instant comparability, the pure average is an aggregated measure whose informational value in the general context of tempo is limited. It is impossible to glean comprehensive rhythmic information without a full cataloging of each and every shot in a film or animation and the accompanying notations of their relative positions. The time and effort needed to actually perform this manually every single time would be punitive. The computational model for automatically extracting shots and the pace function shows promise but is still in its beginning stages. Further research could examine the viability of this model for use in architectural animations.
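The limitation of the pure average can be made concrete with a small sketch. It is not the computational model of Adam et al.; it only derives shot lengths from cut times and compares two hypothetical 60-second animations. The function name and the cut times are invented for illustration.

```python
from statistics import mean, pstdev


def shot_lengths(cut_times, total_length):
    """Shot durations in seconds, given cut times and total running time."""
    bounds = [0.0] + sorted(cut_times) + [total_length]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Two hypothetical 60-second animations with three cuts (four shots) each.
even = shot_lengths([15, 30, 45], 60)   # four shots of equal length
uneven = shot_lengths([3, 6, 9], 60)    # three quick shots, one long take

for name, shots in [("even", even), ("uneven", uneven)]:
    print(name, "ASL:", mean(shots), "spread:", round(pstdev(shots), 1))
```

Both edits share the same average shot length (total length divided by number of shots), yet their rhythm differs markedly, which is why the average alone carries limited information about tempo.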

Average Shot Length in Cinema Films

Nevertheless, having discussed the ramifications of contextualizing shot rate into a more general pace function, the value of average shot length as a discrete measure is not to be underestimated. Changes and trends regarding average shot lengths of films have been well documented throughout the history of cinema. Bordwell and Thompson (2004) note that advances in editing and gradual shifts in montage styles in the past decades have led to an overall reduction of average shot length in films. This trend is clearly visible in Figure 1. The chart details the average shot lengths of over 700 films from the year 1903 to 2006. The numbers were obtained from the Cinemetrics Database (Tsivian, n.d.), an internet resource where film scholars and interested members of the public can submit the results of their own analyses of the average shot lengths of films.

Figure 1. Average shot length of films in seconds from 1903-2006, arranged by year.
The total number of films included is over 700, while the number for each specific year varies.

Although the earliest films exhibit average shot lengths in excess of 35 seconds, this value rapidly decreases in the following decades. With the arrival of sound in the cinema and constant rapid advances in film technology, the years between 1920 and 1950 were a period of great experimentation for the medium and are marked by shifting preferences in shot length. However, filmmakers started to gradually prefer using longer shots after 1950. The causes for this change are varied and not fully understood (Bordwell & Thompson, 2004). Since then, the pace of cutting has increased once more, with average shot lengths in films approaching the 2-second mark in recent years. In contrast, the average shot length of the 26 animations in our test sample was 15.4 seconds (see Table 5). Needless to say, the difference here is dramatic. While exceptional cases do and always will exist, generally speaking, such a high value has not been seen in the film world since the mid-1980s.

Why architectural animations seem to favor such long shot lengths is unclear, but it is tempting to conclude that designers who lack the craft knowledge of filmmakers are simply behind the times. While this claim in and of itself may be too simplistic a conclusion, authors who bemoan the paucity of architectural animations with regard to cinematographic aspects other than shot rate (Alvarado et al., 2005; Alvarado & Isorna, 2004) also hint in this direction. Although this assertion may be disheartening, the growing recognition among designers and architects that a transfer of craft knowledge from film to animation is required is a promising and welcome development.

Limitations and Future Research

The main focus of this study was on examining how cinematographic techniques influenced viewers’ subjective judgments of architectural animations. As it stands, our study is exploratory in nature in that we chose to examine relevant cinematographic techniques of interest as they are employed in existing animations. One major advantage of this approach is that our findings are grounded in real-world scenarios. But because our selection of stimuli is necessarily a sample of the available material, at this stage we are unable to determine absolute values for acceptable lower and upper boundaries of average shot length. Although our findings suggest that, generally speaking, shorter average shot lengths led viewers to rate animations more favorably, it is obvious that a certain minimum length for shots exists below which comprehensibility is compromised. One of the animations that was included in our initial sample but had to be excluded from the analysis for statistical reasons exhibited an average shot length of 0.8 seconds. Participants rated this animation as confusing and hard to follow. This is clearly an extreme example and hints at a lower limit for shot length that should not be crossed. The next logical step for a future study would be a true experimental design with animations that are tailor-made for experimental purposes and that feature, for instance, variable shot rates, different framing of shots, controlled use of tracking shots with varying degrees of camera motion and velocities, and so on. Such an experimental setup would enable one to precisely test the effects of these properties on viewers’ judgment. A further issue that could not be addressed is the usage of sound. In this study, sound for all animations was muted in order to reduce the possible confounding effects of non-visual animation properties. It seems probable that sound in the form of accompanying music, sound effects, or even spoken guidance and explanations will influence viewers’ judgments and may further influence viewers’ understanding of the animation. Future research should therefore examine the usage of sound in this context.

Because the participants in our study were experts on neither architecture nor animation, we deem their judgments to be reasonably representative of a ‘general’ audience. However, because architectural animations must often serve a broad range of purposes, it is likely that their target audiences are diverse and include lay persons and experts alike. For instance, city planning projects and structures in highly frequented areas are interesting to members of the general public and not just to a group of specialists. For this reason, defining ‘typical’ viewers of such animations can be problematic. With this in mind, a future study could examine whether a different group of participants, one that is composed of a specialist audience, would rate the animations differently. Findings gleaned from such a comparison could provide useful practical insights for adjusting the design of animations to the needs and preferences of specific audiences. In addition, research on the actual cognitive mechanisms that process cuts, scene transitions, and shot length in motion films and animations is limited (see May et al., 2003), and the underlying psychological factors that inform subjective judgments remain as yet poorly understood. Clearly, this would be an interesting and fruitful area for future research.


In our analysis of cinematographic techniques in architectural animations, we were able to establish average shot length as a reliable and valid predictor of participants’ judgments of salience, vividness, and diversity. We consider average shot length to be a fundamental cinematographic property that is objectively and reliably assessable and offers easy comparability. However, further research is needed to appreciate the function of shot rates in the greater context of rhythmic film tempo. Current research in this area is still rudimentary and is in need of broader attention.

Finally, some practical considerations: because we have focused on average shot length in our study, a property which results from the montage of shots, it would seem that the insights gained from our analysis most readily relate to the editing or post-production phase of animations (after the main 3D models are made). Although our findings are certainly directly applicable to this stage in development, we stress that this need not be the only area of animation production which could benefit. Actually, it is important for designers to already take into account shot length during the planning stage of animations. Filmmakers usually rely on detailed shooting scripts and storyboards during production to guide them during the shooting process. Generally, their rigorous planning entails that they already have a general notion of how the unique shots will later be pieced together in the cutting room before they start with actual filming. Accordingly, considerations of the specific shots to be employed in animations should be rooted in the planning stage and not thrown in later as an afterthought. It is our belief that their integral inclusion into the planning of the main narrative structure of the presentation encourages the generation of a richer, more engaging end product.


  1. Adam, B., Dorai, C., & Venkatesh, S. (2000a). Role of shot length in characterizing tempo and dramatic story sections in motion pictures. In Proceedings of the 1st IEEE Pacific Rim Conference on Multimedia (pp. 54-57). Sydney: University of Sydney.
  2. Adam, B., Dorai, C., & Venkatesh, S. (2000b). Study of shot length and motion as contributing factors to movie tempo. In Proceedings of the 8th ACM International Conference on Multimedia (pp. 353-355). New York: ACM.
  3. Adam, B., Dorai, C., & Venkatesh, S. (2002). Toward automatic extraction of expressive elements from motion pictures: Tempo. IEEE Transactions on Multimedia, 4(4), 472-481.
  4. Alvarado, R. G., Castillo, G. A., Márquez, J. C. P., & Mayorga, S. N. (2005). Filmic development of architectural animations. International Journal of Architectural Computing, 3(3), 299-316.
  5. Alvarado, R. G., & Isorna, J. M. (2004). The fragmented eye, cinematographic techniques for architectural animations. In Proceedings of the 22nd Conference on Education and Research in Computer Aided Architectural Design in Europe (pp. 366-373). Copenhagen: The Royal Danish Academy of Fine Arts, School of Architecture.
  6. Digital Production GmbH (2002). Animago award: 2002. [DVD]. Munich: Reed Business Information.
  7. Digital Production GmbH (2004). Animago award: 2004. [DVD]. Munich: Reed Business Information.
  8. Digital Production GmbH (2005). Animago award: 2005. [DVD]. Munich: Reed Business Information.
  9. Digital Production GmbH (2006). Animago award: 2006. [DVD]. Munich: Reed Business Information.
  10. Bordwell, D., & Thompson, K. (2004). Film art: An introduction (7th ed.). New York: McGraw-Hill.
  11. Carroll, J. M., & Bever, T. G. (1976). Segmentation in cinema perception. Science, 191, 1053-1055.
  12. Hochberg, J. (1986). Representation of motion and space in video and cinematic displays. In K. R. Boff, L. Kaufmann, & J. P. Thomas (Eds.), Handbook of perception and human performance (Vol. 1, Chap. 22, pp. 22:1–22:64). New York: Wiley.
  13. Kardes, F. R. (2002). Consumer behavior and managerial decision making (2nd ed.). Upper Saddle River, NJ: Prentice Hall.
  14. Knox, M. (2005). Design and communication of architectural space using 3D graphics and film language. In Proceedings of the International Conference on Computer Graphics and Interactive Techniques, ACM SIGGRAPH 2005 Educators Program (No. 9). New York: ACM.
  15. Kraft, R. N. (1986). The role of cutting in the evaluation and retention of film. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12(1), 155-163.
  16. Larsen, V., Wright, N. D., & Hergert, T. R. (2004). Advertising montage: Two theoretical perspectives. Psychology & Marketing, 21(1), 1-15.
  17. May, J., Dean, M. P., & Barnard, P. J. (2003). Using film cutting techniques in interface design. Human-Computer Interaction, 18(4), 325-372.
  18. Monaco, J. (2000). How to read a film: The world of movies, media, and multimedia (3rd ed.). New York: Oxford University Press.
  19. Ng, K., Schnabel, M. A., & Thomas, K. (2006). Architectural animation becomes alive, creating spatial narrative with spatial characters for animations. In Proceedings of the 24th Conference on Education in Computer Aided Architectural Design in Europe (pp. 598-603). Volos, Greece: University of Thessaly, School of Engineering.
  20. Tsivian, Y. (n.d.). CineMetrics.lv | Movie measurement and study tool database. Retrieved October 9, 2007, from http://cinemetrics.lv/database.php
  21. Wiedmer, M., Agotai, D., Lenzin, R., & Kempter, F. (2006). Compositing spaces – The transferring of space relevant film elements into computer-generated architecture-related animation. In Proceedings of the 24th Conference on Education in Computer Aided Architectural Design in Europe (pp. 604-607). Volos, Greece: University of Thessaly, School of Engineering.


Appendix 1. Animation questionnaire (translated version)

Appendix 2. Sample animations representative for those used in this study. All animations presented here can be accessed with the hyperlinks on Youtube.com.

Animation Screenshot Title and Link
(This Animation was not used in the study, but is representative of the material used in this study.)
Neuer Bahnhofplatz Bern 2008, Part I
(This Animation was used in the study.)
Virtual Walkthrough
(This Animation was not used in the study, but is representative of the material used in this study.)
3D Max Interior
(This Animation was not used in the study, but is representative of the material used in this study.)
Mise en scene – office culture
Animation by Medea Willimann
(This Animation was not used in the study, but is representative of the material used in this study.)