A Syntactic-Semantic Optimality Theoretic Model on Hakka Topic-Comment Construction

The purpose of this paper is to show how the basic Topic-Comment ordering pattern of Hakka can be accounted for by constraint-based Optimality Theory. Part of the linguistic data used in this paper is adopted from Xu (2002), while the examples presented as syntactic tests were created by the author. These sentences have been further checked and confirmed by a native speaker of Hakka. This paper proposes an Optimality Theoretic (OT) model that takes into account both syntactic and semantic considerations. It shows that semantic information comes into play successively at different points of OT grammar. First, integrating semantic information into the schema of OT syntax works precisely to describe the Hakka topic-initial sentence pattern. The alignment constraints incorporate information about the semantically defined topic and comment constructions into the constraint design, and they interact with other markedness constraints to filter linguistic constructions during production. Second, semantic constraints are formed to further evaluate form-meaning pairs during the process of interpretation. Here, semantic notions including contrastiveness and markedness are incorporated into the model in order to pair syntactically well-formed sentences with appropriate meanings. The paper presents an optimization model illustrating how syntax and semantics cooperate to pair meanings with linguistic constructions in forming linguistic expressions.


INTRODUCTION
Optimality Theory (OT), proposed by Prince & Smolensky (1993), is a relatively young linguistic theory compared with other theories that have a long history in the generative tradition. OT originated in phonology, but linguists in other fields later adopted the theory to investigate a wide variety of linguistic phenomena in different aspects of grammar. OT explores the grammar of languages through an input-output mapping process. The heart of the process is a means of comparing the linguistic analyses generated for a given input and selecting as the output the one(s) that best satisfy the relevant constraints. The essence of the theory is to assess the well-formedness of the candidates and the corresponding input-output relation based on a set of hierarchically ranked constraints. Abundant linguistic research has adopted OT as the theoretical framework to explain linguistic phenomena of natural languages. However, Hakka, a language that has over 30 million speakers, has rarely been the subject of OT-based syntactic investigation. Therefore, in this paper, we look in greater depth at the construction of Hakka topicalization and adopt OT as the theoretical framework to fill the gap between Hakka and OT syntax. The organization of this paper is as follows.
The first part of the paper proposes an OT model that takes into account both syntactic and semantic considerations. It also shows that integrating semantic information into the schema of OT syntax works precisely to describe the structure of Hakka syntactic constructions. Three kinds of constraints, namely alignment, faithfulness, and markedness (structural) constraints, are useful in deriving the order and pattern of Hakka constructions. At the same time, semantic information comes into play in more than one way. On the one hand, semantic information may form syntactic constraints that interact with other constraints to filter linguistic constructions during production. On the other hand, semantic constraints are formed to evaluate form-meaning pairs during the process of interpretation. The second part of the paper uses Hakka syntactic configurations to illustrate these points. Finally, the paper concludes with an optimization model that illustrates the formation of topicalization in Hakka, modeled as the result of input-output mapping from meaning to linguistic form.

LITERATURE REVIEW
This paper adopts the bidirectional version of Optimality Theory to analyze the topic-initial construction of Hakka. The significance of the topic's role in Chinese languages was noted as early as Chao (1968). According to Chao (1968), the topic-comment notion defines the relationship between subject and predicate in Chinese. This idea was further developed by Li & Thompson (1976), who drew a dichotomy between topic-prominent and subject-prominent languages and claimed that Chinese is a topic-prominent language in which the topic, rather than the grammatical subject, is grammaticalized in the preverbal position. In this paper, a syntactic-semantic OT-based account is proposed to investigate the topic-comment construction in Hakka. A brief review of the traditional and the bidirectional versions of the theory is provided in this section.
This paper contributes to filling the gap between Hakka syntax and OT by examining the basic sentence patterns of Hakka with a modern theory built on the notions of constraint interaction and input-output optimization. While OT has been successfully applied in different linguistic domains to deal with a wide array of structural patterns observed in language users' linguistic performance, studies of Hakka have applied the theory mainly to phonological features such as tone sandhi and syllable structure (Chen, 2000; Hsiao, 2015; Hsu, 2005; Lin, 2005, 2011; Tung, 2011). Relatively few studies have adopted the theory in syntax or at the syntax-semantics interface. Previous studies on syntactic OT include Tseng's works on Hakka syntactic constructions, including Hakka relative clauses (Tseng, 2011), prepositional phrases (Tseng, 2012), and nominal constructions (Tseng, 2020).
The basic architecture of OT is as follows. Given an input, OT grammar generates candidates that compete with one another based on a set of hierarchically ranked constraints. These constraints are violable, and the candidate that violates the lowest-ranking constraint is the optimal candidate compared to other candidates that violate at least one higher-ranking constraint. The optimal candidate is the one that 'minimally' violates the constraints. It is selected to be the output, while all the other candidates are ruled out by incurring a more serious violation of the constraints in the OT plan, and the violation is 'fatal'.
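The evaluation step described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the candidates (cand_a, cand_b, cand_c), the constraints C1 and C2, and their violation counts are hypothetical and are not taken from the Hakka analysis. Strict domination is modeled by comparing violation profiles lexicographically, from the highest-ranked constraint down.

```python
from typing import Callable, List, Tuple

# A constraint maps a candidate to the number of violations it incurs.
Constraint = Callable[[str], int]

# Hypothetical violation counts for three abstract candidates.
C1_MARKS = {"cand_a": 0, "cand_b": 1, "cand_c": 0}
C2_MARKS = {"cand_a": 1, "cand_b": 0, "cand_c": 2}


def c1(candidate: str) -> int:
    return C1_MARKS[candidate]


def c2(candidate: str) -> int:
    return C2_MARKS[candidate]


def profile(candidate: str, ranking: List[Constraint]) -> Tuple[int, ...]:
    """Violation counts ordered from the highest- to the lowest-ranked constraint."""
    return tuple(constraint(candidate) for constraint in ranking)


def evaluate(candidates: List[str], ranking: List[Constraint]) -> List[str]:
    """Return the candidate(s) whose profile is lexicographically smallest."""
    best = min(profile(c, ranking) for c in candidates)
    return [c for c in candidates if profile(c, ranking) == best]


RANKING = [c1, c2]  # C1 >> C2 (hypothetical ranking)
print(evaluate(["cand_a", "cand_b", "cand_c"], RANKING))  # ['cand_a']
```

Under the ranking C1 >> C2, cand_b is eliminated by its violation of the higher-ranked C1, and cand_a beats cand_c because it incurs fewer violations of C2. Reversing the ranking would select cand_b instead, which is how re-ranking the same constraints can model cross-linguistic variation.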

Bidirectional Optimality Theory
OT began as a phonological theory. Prompted by linguists' attempts to apply the theory more generally to different aspects of grammar, including sounds, words, sentences, and meaning, a bidirectional version of Optimality Theory (BiOT) emerged. Hence, it is argued to be an integrated approach that combines areas such as phonology, morphology, syntax, semantics, language acquisition, and cognitive science to cope with linguistic problems in a more precise way (Beaver & Lee, 2004; Blutner, 2000; Hendriks et al., 2010; Huddlestone & de Swart, 2014; Jäger, 2004; Wilson, 2001; Zeevat, 2001).
BiOT evaluates grammar from both the speaker's and the hearer's perspectives. The original idea was proposed by Blutner (2000), who argued that the merit of BiOT is to combine the generative and interpretational optimization processes when applying OT to evaluate the structure of natural languages. Optimization proceeds in two directions with the aim of reconciling production with comprehension, because language is an interplay of the two perspectives and communication must take place between encoders and decoders. BiOT considers potential linguistic forms for the representation of a specific meaning; in addition, it associates meanings with expressions to form multiple form-meaning pairs and evaluates each pair as a unit in order to select the optimal correspondence.

METHOD
This paper establishes a BiOT model to investigate the Hakka topic-comment construction. The Hakka data adopted in this paper are partially drawn from the book 'Hakka Little Prince' by Xu Zhao Quan (2002). Some of the sentences listed in this paper were created by the author for the purpose of performing syntactic tests. These sentences have been checked and confirmed by the author's language consultant, a 70-year-old native Hai-Lu Hakka speaker from Miao-Li, Taiwan.
In the following part of this section, the author builds a BiOT model that integrates syntax and semantics to account for the Hakka topic-comment construction.

An Interplay of Production and Interpretation
In the sense of BiOT, grammar is a bidirectional optimization process that combines evaluations from the perspectives of production and interpretation. This paper proposes Figure 1 to show that the speaker uses bidirectional optimization during language production. The speaker restricts his/her optimal productions by further checking whether the intended meaning can be recovered, given the alternative ways in which he/she can understand the utterance. For example, suppose a Hakka speaker wants to express the meaning 'a person'. He/she can produce multiple linguistic forms that conform to the designated meaning, including both yit ge ngin and yit sa ngin. The speaker then tries to select an appropriate expression by checking how he/she understands the difference between the two. While yit sa ngin shows politeness and respect, yit ge ngin is more widely used with a casual connotation. In this case, both linguistic patterns are syntactically well-formed, and semantic information is employed to select the one appropriate in a given context. The diagram in Figure 1 shows how syntax and semantics cooperate to produce linguistic expressions. Syntactic OT first filters optimal linguistic forms, and semantic OT then brings in concerns from the interpretation end to pair the optimal forms with alternative meanings. An appropriate grammatical utterance is then produced.

An Interplay of Syntax and Semantics
In the sense of BiOT, grammar incorporates syntax with semantics to relate meanings to forms and then distinguish forms into different meanings. This paper explains the idea by presenting Figure 2, showing that the incorporation of syntax with semantics involves two rounds of the process. Diagram (a) shows the first round of syntactic OT evaluation in which a meaning input may generate multiple output forms. Diagram (b) shows the second round of semantic OT evaluation that relates the forms generated from Diagram (a) to a few different but related meanings.

Building a BiOT Model for Hakka
In this paper, we propose the following BiOT model in Figure 3 to show how semantic information cooperates with syntax to pair meanings with linguistic constructions that are derived from the process of syntactic optimization, which takes f-structure as an input for the analysis. F-structure stands for feature/function structure. The concept is derived from lexical functional grammar (LFG). F-structure is a syntactic representation of grammatical functions such as subject, object, tense, aspect, number, person, etc.
The diagram in Figure 3 shows that the evaluation process is divided into two parts. The first part describes the process of syntactic OT analysis. Taking an f-structure as the input for analysis, OT generates a set of candidates to be evaluated by syntactic constraints. These candidates compete with each other, and the one(s) that best satisfy the constraint hierarchy are selected as the output. If the first round of OT generates multiple outputs corresponding to the same f-structure, the second round of OT is activated as a mechanism that pairs different linguistic forms with a few different but related meanings. The second part of the process thus applies semantic OT analysis. The underlying concept is that no two linguistic forms should have precisely the same meaning. In this model, semantics plays a role in different parts of the analysis. As shown in Figure 3, semantic information is provided to form syntactic constraints that filter linguistic constructions in the first part of the analysis. In addition, semantic constraints evaluate form and meaning as a pair and select the optimal one out of the various form-meaning combinations.

RESULTS AND DISCUSSION
This section uses the Hakka topic-initial ordering pattern to illustrate the model presented in the previous section. The aim of the analysis is to show how syntax and semantics interact to account for Hakka structures. Section 4.1 focuses on the first part of the analysis, and Section 4.2 addresses the second part.

[Figure: BiOT model. Output/Input: syntactic constructions that best satisfy the OT evaluation (semantics is incorporated into the constraints). Semantic constraints: semantic OT maps grammatical forms to potential meanings. Output: form-meaning pairs are derived.]

Syntactic OT Round
In OT, three types of constraints are frequently used to generate grammatical constructions. First, Generalized Alignment (GA) constraints (McCarthy & Prince, 1993a, 1993b) are positional constraints that require the edge of some linguistic constituent to coincide with the edge of a designated domain. Since Chinese languages, including Hakka, are often classified as 'topic-prominent' languages (Chao, 1968; Chen & Yeh, 2007; Li & Thompson, 1976), in which the topic of a sentence appears in sentence-initial position, we can use GA constraints to derive the word order. Some examples are given in (1). This paper uses the following abbreviations in the gloss: CL 'classifier'; MOD 'modifier'; PAT 'patient marker'; PERF 'perfective aspectual marker'; RVC 'resultative verbal compound'.
(1) [...] (Xu, 2000, p. 129)
Generally, a sentential topic states the 'aboutness' of a sentence. It introduces the referent that the comment of the sentence is about (Reinhart, 1981). We can test the sentences in (1) with Reinhart's (1981) diagnostic, according to which the topic of a sentence is the item X in the answer to the request 'tell me about X'.
(2) [...]
The topic of a sentence can be a phrase of different syntactic categories. For example, a prepositional phrase in (1a), a clause in (1b), and a noun phrase in (1c) can all be the topic. As shown in (1), the topic of a sentence stands in sentence-initial position. Therefore, the GA constraints account for the ordering pattern by means of two alignment constraints. Tableau 1 illustrates the constraint ranking: the positional constraint that aligns the topic to the left edge of the sentence outranks the constraint that aligns the remainder of the sentence to the left edge. S stands for 'sentence'. In this paper, we do not specifically propose TP (Tense Phrase) as the scope for sentential constituents because all the tense and aspect features should be marked in the f-structure of verbs.
Tableau 1
Candidate | Topic aligned to left edge of S | Remainder aligned to left edge of S
☞ Topic-Comment | | *
Comment-Topic | *! |
Comment-Topic-Comment | *! |
The asterisk (*) means 'constraint violation'; the exclamation mark (!) means 'the violation is fatal'. The pointing finger (☞) indicates 'the optimal output'.
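The ranking argument behind Tableau 1 can also be sketched computationally. The Python fragment below is a minimal illustration, assuming that alignment violations are categorical (one mark when the relevant constituent is not at the left edge of S) and that candidates can be represented as sequences of the labels "Topic" and "Comment". The function names align_topic_left and align_comment_left are hypothetical stand-ins for the two GA constraints.

```python
from typing import List, Tuple

Candidate = Tuple[str, ...]


def align_topic_left(candidate: Candidate) -> int:
    """One violation if the topic is not at the left edge of S."""
    return 0 if candidate[0] == "Topic" else 1


def align_comment_left(candidate: Candidate) -> int:
    """One violation if the comment is not at the left edge of S."""
    return 0 if candidate[0] == "Comment" else 1


# Topic alignment outranks comment alignment, as argued for Tableau 1.
RANKING = [align_topic_left, align_comment_left]

CANDIDATES: List[Candidate] = [
    ("Topic", "Comment"),
    ("Comment", "Topic"),
    ("Comment", "Topic", "Comment"),
]


def profile(candidate: Candidate) -> Tuple[int, ...]:
    """Violation marks ordered from the higher- to the lower-ranked constraint."""
    return tuple(constraint(candidate) for constraint in RANKING)


def optimal(candidates: List[Candidate]) -> List[Candidate]:
    """Select the candidate(s) with the lexicographically smallest profile."""
    best = min(profile(c) for c in candidates)
    return [c for c in candidates if profile(c) == best]


print(optimal(CANDIDATES))  # [('Topic', 'Comment')]
```

The topic-initial candidate violates only the lower-ranked constraint, whereas both comment-initial candidates violate the higher-ranked one, so they are eliminated first.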
In addition to the GA constraint, the second kind of constraint, markedness constraint, investigates the structural well-formedness of output candidates. Faithfulness constraint is the third kind of constraint, examining the correspondence of the input and output and requiring the identity between them. The interaction of the two kinds of constraints illustrates the grammar of languages and accounts for a wide range of linguistic phenomena.
It has been argued that even though the idea of 'topic' exists in both Chinese and English, the two languages behave differently in constructions involving a sentence-initial topic. As indicated by Chafe (1987), Chinese has a unique topic construction in which the sentence-initial topic is formed through base generation. A comparison of the topic construction in English and Hakka is shown in (6) and (7). The English examples in (6) show that when the object is topicalized, a resumptive pronoun may appear in the base position of the object, and it must be co-indexed with the topicalized object; otherwise, the sentence becomes ungrammatical, as in (6c). By contrast, the Hakka sentence in (7c) shows that in the same situation, the topic and the object need not have the same referent, and the sentence remains grammatical. According to Chafe (1976), this topic describes the aboutness of a sentence. It is unique to Chinese languages and is called the Chinese style topic construction.
OT accounts for the topic construction by proposing a faithfulness constraint, INCLUSIVENESS, against insertion (i.e., a kind of DEP constraint, which requires elements of the output to have correspondents in the input) (Legendre et al., 1998; Salzmann, 2006). The faithfulness constraint interacts with the markedness constraint on theta-role assignment (θ-ASSIGNMENT), which requires the argument of a verb to be c-commanded by the head verb (Müller, 2009). This constraint is also proposed by other linguists, such as Grimshaw (1997) and Kager (1999), as ECONOMY, which requires an economical linguistic expression and disfavors movement. Another markedness constraint, ARGUMENT 2, is proposed to prohibit the repeated occurrence of the same argument. This constraint encourages the substitution of a proform for an argument when it is referred to a second time in the sentence. It manifests the well-known OCP (Obligatory Contour Principle) effects (Goldsmith, 1976; Leben, 1973; McCarthy, 1981, 1986; Tseng, 2008), in which identical linguistic elements are not allowed to appear repeatedly. These constraints are defined in (8).
(8) [...]
ARGUMENT 2: An argument of V does not occur repetitively in one sentence.

The constraint interaction is shown in Tableau 2. The analysis can account for both the English examples in (6) and the Hakka examples (7a) and (7b).
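A minimal sketch of this interaction is given below in Python, using the violation marks discussed for the four candidates. Several points are assumptions made only for illustration: the candidate labels are descriptive stand-ins for the actual sentences, θ-ASSIGN and INCL are placed in the same stratum (left unranked relative to each other), and ARGUMENT 2 is placed in the lowest stratum. Violations within a stratum are summed, and strata are compared from the highest down.

```python
from typing import Dict, List, Tuple

# Assumed stratified ranking: ALIGN-TOPIC >> {THETA-ASSIGN, INCL} >> ARGUMENT-2.
# The placement of ARGUMENT-2 is a hypothetical choice for this sketch.
STRATA: List[List[str]] = [["ALIGN-TOPIC"], ["THETA-ASSIGN", "INCL"], ["ARGUMENT-2"]]

# Violation marks for each candidate type, following the prose discussion.
VIOLATIONS: Dict[str, Dict[str, int]] = {
    "topic in situ": {"ALIGN-TOPIC": 1},
    "topic + gap": {"THETA-ASSIGN": 1},
    "topic + resumptive pronoun": {"INCL": 1},
    "topic + repeated full NP": {"INCL": 1, "ARGUMENT-2": 1},
}


def profile(marks: Dict[str, int], strata: List[List[str]]) -> Tuple[int, ...]:
    """Sum the violations within each stratum, from highest to lowest."""
    return tuple(sum(marks.get(c, 0) for c in stratum) for stratum in strata)


def winners(violations: Dict[str, Dict[str, int]],
            strata: List[List[str]]) -> List[str]:
    """Return every candidate whose stratified profile is minimal."""
    profiles = {cand: profile(marks, strata) for cand, marks in violations.items()}
    best = min(profiles.values())
    return [cand for cand, p in profiles.items() if p == best]


print(winners(VIOLATIONS, STRATA))
# ['topic + gap', 'topic + resumptive pronoun']
```

Because the two middle constraints are tied, the gap candidate and the resumptive-pronoun candidate come out equally harmonic, which is what allows the syntactic round to return two outputs.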
The alignment constraint filters out the first candidate because the topic is not in initial position. The second and third candidates are equally harmonic, as each of them violates one of the constraints in (8). The second candidate violates θ-ASSIGN because the object NP has been moved leftward outside the domain of the original VP. The third candidate violates INCL because a third person resumptive pronoun has been inserted, incurring a violation of the I-O faithfulness constraint. The last candidate also violates INCL because a full noun phrase, ngiungiuk mien 'beef noodle', has been inserted into the object position; in addition, it incurs a violation of ARGUMENT 2 due to the repeated occurrence of this NP. As for (7c), which presents the so-called "Chinese style topic construction" proposed by Chafe (1976), we argue that it is derived from a different f-structure, as illustrated by the side-by-side contrast in (9).

(9)
The OT evaluation for (9b) is shown in Tableau 3. A fatal violation is incurred if the object NP is not present or if it is replaced by a pronoun. As shown in Tableau 3, since the topic is no longer base-generated in the object position, neither leftward movement nor insertion of a resumptive pronoun is necessary. At this stage, semantic information steps in to form OT constraints. This can be seen when we select an appropriate pronoun to substitute for the fronted topic. Pronouns commonly agree with their antecedents in certain semantic features, and different languages may require different kinds of pronoun concord. An English pronoun must agree with its antecedent in person, number, and gender; by contrast, a Chinese pronoun corresponds to its antecedent only in person and number. The pronoun concord of Hakka is shown in (10). Given that the topic NP ngiungiuk mien is associated with the features singular and third person (inanimate), the semantic faithfulness constraints in (11) are proposed to require feature agreement between pronouns and their antecedents.

(11) FAITH-SEMF (PERSON):
A semantic agreement must be reached between the pronoun and its antecedent in the feature [person].

FAITH-SEMF (NUMBER):
A semantic agreement must be reached between the pronoun and its antecedent in the feature [number].
Tableau 4 illustrates that the two semantic constraints can account for the pronoun selection for the Hakka case. When the topic NP is associated with the features of [singular] and [third person (inanimate)], the resumptive pronoun must bear the same semantic features, so that the two correspondence faithfulness constraints can be satisfied, and the semantic agreement on person and number between the pronoun and its antecedent is reached.
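The effect of the two agreement constraints can be sketched as follows in Python. This is a minimal illustration, not the paper's tableau: the pronoun inventory is given as abstract feature bundles (1SG, 2SG, 3SG, 3PL) rather than actual Hakka forms, and the class and function names are hypothetical. The relative ranking of the two constraints is not stated above; it does not affect the outcome here, because one candidate satisfies both.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Features:
    person: int   # 1, 2, or 3
    number: str   # "sg" or "pl"


def faith_semf_person(pronoun: Features, antecedent: Features) -> int:
    """FAITH-SEMF (PERSON): one violation if [person] does not match."""
    return int(pronoun.person != antecedent.person)


def faith_semf_number(pronoun: Features, antecedent: Features) -> int:
    """FAITH-SEMF (NUMBER): one violation if [number] does not match."""
    return int(pronoun.number != antecedent.number)


# Hypothetical inventory of abstract pronoun feature bundles.
INVENTORY: Dict[str, Features] = {
    "1SG": Features(1, "sg"),
    "2SG": Features(2, "sg"),
    "3SG": Features(3, "sg"),
    "3PL": Features(3, "pl"),
}


def select_pronoun(antecedent: Features,
                   inventory: Dict[str, Features]) -> List[str]:
    """Return the pronoun(s) with the fewest agreement violations."""
    profiles: Dict[str, Tuple[int, int]] = {
        label: (faith_semf_person(f, antecedent), faith_semf_number(f, antecedent))
        for label, f in inventory.items()
    }
    best = min(profiles.values())
    return [label for label, p in profiles.items() if p == best]


topic = Features(3, "sg")  # third person singular topic NP
print(select_pronoun(topic, INVENTORY))  # ['3SG']
```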

Semantic OT Round
Each linguistic form should have its own unique meaning. When syntactic OT selects two grammatical outputs for a given f-structure, we can predict that the meanings of the two optimal outputs differ slightly. Therefore, a second round of semantics-based OT analysis is activated, which is described as a form-meaning pairing process.
The following (12) relists the previous examples (7a) and (7b), the two optimal outputs generated from Tableau 4. [...] (Giusti, 2006; Molnár, 2002; Neeleman et al., 2009). A contrastive topic is about the alternatives of an expression (Tomioka, 2010). It implies a negation of at least one alternative relevant to the topic, which may or may not be mentioned in the context. The contrastive and non-contrastive distinction can be explained by creating a context that requires the use of a contrastive topic construction, as shown in (13). The question mark (?) means that the sentence is syntactically well-formed but not semantically appropriate in the given context. It should be noted that B2 in (13) and in (14) is syntactically well-formed, as both are optimal outputs selected in the first round of OT evaluation. In (13) and (14), a syntactic context that indicates a set of two alternatives forces an appropriate response to be one that can express the contrastive meaning. In this case, both B1 instances are better than B2. The two B1 sentences not only indicate the topic of the sentence but also imply a negation of the other alternative based on the reference given by the comment. In (13) and (14), since A's utterance has pointed out the existence of two individuals as alternatives, both B1 and B2 select one alternative out of the set of alternatives and offer a comment to defend the selection. B1 in (13) implies that the thing he has devoured is "beef noodle," rather than the other bowl of "pork noodle" indicated in A. B1 in (14) implies that the person nobody likes is "his brother", but his sister does not have the same problem. The two B2 instances, by contrast, are merely descriptive: the linguistic form means what is said, with no further implication detectable.
Therefore, it is argued that B1 is more appropriate than B2 in this context because it is B1, rather than B2, that carries the implication that the unselected alternative is excluded from the semantic domain of the comment.
In this paper, we argue that a topic is considered marked when it coincides with a non-contrastive interpretation, in contrast to an unmarked contrastive interpretation. Topic does not necessarily refer to old information, but it has a strong correlation with information that is known or has previously been mentioned in the (con)text (de Swart & de Hoop, 1995). Furthermore, the topic indicates what the sentence is about; therefore, when topicalization occurs, the speaker has to resort to the previous context or the common ground to grammatically mark a piece of information as the topic of the following sentence. In that case, the speaker has to contrast the selected information with other information provided in the context and single it out by assigning to it some sort of pragmatic prominence as the topic of the next utterance. Therefore, we argue that "contrastive meaning" resides in the basic meaning of a sentential topic. Contrastiveness is correlated with the unmarked meaning encoded in the topic information, while non-contrastiveness is relatively marked.
The first semantic OT evaluation proposes the following semantic constraints to describe this phenomenon, as shown in (15).
(15) *TOPIC=CONTRASTIVE: A topic should not be coded with contrastive meaning.
*TOPIC=NONCONTRASTIVE: A topic should not be coded with non-contrastive meaning.
The OT evaluation presented in Tableau 5 illustrates the point mentioned in this section. The topic of a sentence is naturally compatible with a contrastive interpretation. The topic information often refers to old information, and the speaker has to contrast the selected information with other known information in a given context and then specifically topicalize it. As shown in Tableau 5, the constraint interaction shows that the semantic constraint arguing against a topic associated with a non-contrastive meaning overrides the constraint against a topic that contains a contrastive meaning. Now we should start pairing the two syntactically well-formed constructions (12a) and (12b) (i.e., f1 and f2) derived from the first round of syntactic OT analysis to the two alternative contrastive and non-contrastive interpretations. This paper proposes that the meaning contrast between an empty object position (12a=f1) and a resumptive pronoun (12b=f2) is triggered by the 'quantity principle' indicated by Givón (1991) in his explanation of the principle of iconicity. According to Givón (1991), less predictable information tends to be encoded with more coding material. To paraphrase this point to conform to the Hakka case, more predictable meaning, more inclined to be naturally perceived by the language users as a common interpretation, is considered unmarked and tends to be more economical in linguistic structure. In contrast, marked meanings are less predictable interpretations, and they tend to be more iconic and explicit linguistic structures.
Therefore, the semantic OT proposes the following constraints in (16), and the relevant constraint interaction is illustrated in Tableau 6.
(16) *MARKED=ECONOMIC: Marked meaning should not be more concise than its unmarked counterpart.
*MARKED=ICONIC: Marked meaning should not be more explicit (faithful to the input) than its unmarked counterpart.
Tableau 6. OT pairing of (12a) and (12b) with contrastiveness.
According to the result of Tableau 5, m1 is the unmarked meaning and m2 is the marked meaning. Therefore, in Tableau 6, the optimal form-meaning pairing associates the marked non-contrastive meaning with (12b), where a resumptive pronoun is present, and the unmarked contrastive meaning with (12a), where the object position is empty. This pairing satisfies the higher-ranking semantic constraint, which favors linking a marked meaning with the more explicit form and an unmarked meaning with the more economical form.
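The two-step semantic round can be sketched in Python as follows. This is a minimal illustration under stated assumptions: f1 stands for the gap construction (12a) and f2 for the resumptive-pronoun construction (12b), the EXPLICITNESS scores are a hypothetical way of encoding that f2 carries more coding material than f1, and the rankings *TOPIC=NONCONTRASTIVE >> *TOPIC=CONTRASTIVE and *MARKED=ECONOMIC >> *MARKED=ICONIC follow the discussion of Tableaux 5 and 6.

```python
from itertools import permutations
from typing import Dict, Tuple

FORMS = ["f1", "f2"]                       # f1 = gap (12a), f2 = pronoun (12b)
MEANINGS = ["contrastive", "non-contrastive"]
EXPLICITNESS = {"f1": 0, "f2": 1}          # hypothetical scale: higher = more coding material


def topic_profile(meaning: str) -> Tuple[int, int]:
    """Step 1: (*TOPIC=NONCONTRASTIVE, *TOPIC=CONTRASTIVE), highest-ranked first."""
    return (int(meaning == "non-contrastive"), int(meaning == "contrastive"))


# The meaning with the smaller profile is the unmarked one.
UNMARKED = min(MEANINGS, key=topic_profile)
MARKED = next(m for m in MEANINGS if m != UNMARKED)


def pairing_profile(pairing: Dict[str, str]) -> Tuple[int, int]:
    """Step 2: (*MARKED=ECONOMIC, *MARKED=ICONIC) for a complete form-meaning pairing."""
    marked_form = next(f for f, m in pairing.items() if m == MARKED)
    unmarked_form = next(f for f, m in pairing.items() if m == UNMARKED)
    marked_economic = int(EXPLICITNESS[marked_form] < EXPLICITNESS[unmarked_form])
    marked_iconic = int(EXPLICITNESS[marked_form] > EXPLICITNESS[unmarked_form])
    return (marked_economic, marked_iconic)


# Candidate pairings: every one-to-one assignment of meanings to forms.
PAIRINGS = [dict(zip(FORMS, perm)) for perm in permutations(MEANINGS)]
BEST = min(PAIRINGS, key=pairing_profile)

print(UNMARKED)  # contrastive
print(BEST)      # {'f1': 'contrastive', 'f2': 'non-contrastive'}
```

The sketch treats each complete pairing as a single candidate, so that the marked meaning is always evaluated relative to its unmarked counterpart.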

Syntax-Semantics BiOT Model for Hakka Topic Construction
To account for the Topic-Comment word order of Hakka, two rounds of OT analysis are employed. The first round evaluates the linguistic forms generated for expressing a specific meaning based on the interaction of a set of hierarchically ordered constraints, including Generalized Alignment, Faithfulness, and Markedness constraints. If multiple outputs are derived from the first round of evaluation, the second round is employed, which evaluates the optimal outputs against possible alternative meanings. The second round is semantically based, and it distinguishes meanings by pairing linguistic forms with the more harmonic meaning. The overall process is summarized in Figure 4. According to Figure 4, the first round of OT generates two syntactically well-formed outputs for the topic construction through an optimization process that evaluates linguistic forms based on a set of hierarchically ranked constraints. One of the outputs contains a resumptive pronoun in the base position of the topic information; the other leaves a gap in the base position. The two optimal outputs are sent to the second round of OT evaluation. Semantic constraints are adopted to form harmonic form-meaning pairs in two steps. First, the optimization process pairs the topic construction with an unmarked contrastive meaning. Second, the process pairs the more economical output with the unmarked contrastive meaning and the more iconic output with the marked non-contrastive meaning.

CONCLUSION
Syntactic analysis of Hakka structures formulated on the basis of BiOT is scarce. Tseng has proposed BiOT analyses of Hakka relative constructions (Tseng, 2011) and of nominal constructions involving the functional morpheme GAI (Tseng, 2020). The model proposed in this paper has the capacity to capture the theoretical analysis developed by Tseng in her works (2011, 2020). According to her, restrictive and nonrestrictive relative clauses in Hakka are distinguished by a cyclic application of OT analysis. The first round of OT selects two linguistic patterns for the expression of relative clauses. In the second round, semantic constraints are employed to distinguish nonrestrictive from restrictive relative clauses and to pair the syntactic patterns selected in the first round with the meanings associated with the two kinds of relative clauses. In her more recent publication on NP constructions involving a head noun modified by a phrasal modifier, the presence or absence of the modificational morpheme GAI is accounted for with a bidirectional version of OT analysis. The syntactic OT uses hierarchically ordered linguistic constraints to generate well-formed syntactic patterns that may or may not contain the functional head GAI. If the presence of GAI is optional, two syntactic outputs are derived, and the semantic OT is employed to pair the optimal outputs with two similar but still distinguishable meanings.
The current research enriches linguistic studies conducted within OT and argues for the wider applicability of BiOT as a theoretical framework accounting for different aspects of Hakka syntax. The researcher will continue to apply this theory to investigate other parts of Hakka syntactic structure. In addition, the current syntax-semantics model can potentially be employed to describe syntactic phenomena found in other languages. By establishing cross-linguistic variation in constraint ranking, which is one of the cornerstones of Optimality Theory in general, the BiOT model gains stronger explanatory power.