The Amino Acid Sequence of Rat Liver Glucokinase Deduced from Cloned cDNA*


THE JOURNAL OF BIOLOGICAL CHEMISTRY

© 1989 by The American Society for Biochemistry and Molecular Biology, Inc. Vol. 264, No. 1, Issue of January 5, pp. 363-369, 1989. Printed in U.S.A.


(Received for publication, May 18, 1988)

Teresa L. Andreone‡, Richard L. Printz, Simon J. Pilkis§, Mark A. Magnuson¶, and Daryl K. Granner‖

From the Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, Tennessee 37232 and the §Department of Physiology and Biophysics, State University of New York, Stony Brook, New York 11794

Rat liver glucokinase (ATP:D-hexose 6-phosphotransferase, EC 2.7.1.1) was purified to homogeneity, cleaved, and subjected to amino acid sequence analysis. Forty-five percent of the protein sequence was obtained, and this information was used to design oligonucleotide probes to screen a rat liver cDNA library. A 1601-base pair cDNA (GK1) contained an open reading frame that encoded the amino acid sequences found in the peptides used to generate the oligonucleotide probes. A second cDNA (GK.Z2) was subsequently identified, which is 2346 base pairs long and corresponds to nearly the entire glucokinase mRNA. Blot transfer analysis of hepatic RNA showed that glucokinase mRNA exists as a single species of about 2400 nucleotides. Four hours of insulin treatment of diabetic rats resulted in a 30-fold induction of this mRNA. GK.Z2 has a long open reading frame which, with the known partial peptide sequence, allowed us to deduce the primary structure of glucokinase. The enzyme is composed of 465 amino acids and has a mass of 51,924 daltons. Glucokinase has 53 and 33% amino acid sequence identities with the carboxyl-terminal domains of rat brain hexokinase I and yeast hexokinase, respectively. If conservative amino acid replacements are also considered, glucokinase is similar to these two enzymes at 75 and 63% of positions, respectively. The putative glucose- and ATP-binding domains of glucokinase were identified, and these regions appear to be highly conserved in the hexokinase family of enzymes.

* This work was supported in part by National Institutes of Health Grants DK 35107 (to D. K. G.) and DK 38354 (to S. J. P.), National Science Foundation Grant DMB 8608989 (to S. J. P.), Vanderbilt Diabetes Research and Training Center Grant DK 07061, and grants from the Juvenile Diabetes Foundation (to D. K. G. and M. A. M.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) J04218.

‡ Postdoctoral Fellow of the Juvenile Diabetes Foundation supported by Grant 386189.

¶ Recipient of a Research Career Development award from the American Diabetes Association.

‖ To whom correspondence should be addressed: Dept. of Molecular Physiology and Biophysics, 607 Light Hall, Vanderbilt University, Nashville, TN 37232.

Glucokinase (ATP:D-hexose 6-phosphotransferase, EC 2.7.1.1), because it is expressed only in liver and pancreatic β-cells, is a unique member of the family of mammalian hexokinases (1). Glucokinase plays a key role in glucose homeostasis. Because of its high Km (5 mM) and specificity for glucose and the lack of product inhibition by glucose 6-phosphate, it ensures a gradient for glucose entry into the hepatocyte (2). This is particularly important in the postprandial state when plasma glucose levels are elevated. Pancreatic β-cell glucokinase is probably equally important because flux through the glycolytic pathway is thought to be involved in glucose-mediated insulin release (3).

Hepatic glucokinase, like other critical metabolic enzymes, is regulated by a variety of dietary and hormonal factors, all of which appear to alter the rate of synthesis of the protein (4). Glucokinase is increased after feeding (particularly a carbohydrate-rich diet) and is decreased by fasting (5). These changes are probably mediated by insulin and glucagon. Insulin, whose levels are increased in the postprandial state, presumably accounts for the feeding-induced change in glucokinase since it also restores to normal the low glucokinase activity found in diabetic animals (2). Glucagon, elevated in fasting animals, has exactly opposite effects (2). The actions of glucagon appear to be mediated by cAMP, which itself decreases glucokinase when injected into animals (6). Changes in the rate of synthesis of glucokinase have been correlated with corresponding alterations of mRNAGK (6-9) and with rates of transcription of the glucokinase gene (9).

The other hexokinases found in liver (types I-III; glucokinase is sometimes referred to as type IV) have low Km values for glucose (~20-130 µM), catalyze the phosphorylation of a variety of hexoses, are subject to feedback inhibition by glucose 6-phosphate, and are not regulated by insulin or glucagon (1, 2). These hexokinases have a molecular mass of ~100 kDa, whereas glucokinase and yeast hexokinases PI and PII (like glucokinase, not subject to feedback inhibition by glucose 6-phosphate) have a molecular mass of ~50 kDa. All are postulated to be members of a family in which the larger mammalian enzymes arose from a duplication and fusion event from an ancestral gene similar to the yeast or glucokinase genes. This would account for the suspected similarity in amino acid sequence (10) and the antigenic cross-reactivity that has been demonstrated between yeast and mammalian hexokinases (11). These features, coupled with the relative scarcity of glucokinase in rat liver, have heretofore precluded efforts to purify glucokinase in amounts necessary for detailed analysis. It therefore has not been possible to deduce the structures of the glucokinase gene, mRNA, and protein.
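The kinetic contrast drawn above between glucokinase (Km of 5 mM) and the low-Km hexokinases (Km of roughly 20-130 µM) can be illustrated with a short calculation. The sketch below assumes simple Michaelis-Menten kinetics, which the text does not state; the hexokinase Km of 0.1 mM is one value picked from within the quoted range, and the two glucose concentrations are illustrative choices rather than measurements from this paper.

```python
def fractional_velocity(glucose_mM: float, km_mM: float) -> float:
    """Return v/Vmax = S / (Km + S) for a simple Michaelis-Menten enzyme."""
    if glucose_mM < 0 or km_mM <= 0:
        raise ValueError("glucose must be >= 0 and Km must be > 0")
    return glucose_mM / (km_mM + glucose_mM)


GLUCOKINASE_KM_MM = 5.0  # value given in the text
HEXOKINASE_KM_MM = 0.1   # assumption: one value inside the 20-130 uM range

# Illustrative glucose concentrations in mM (assumptions, not data from the paper).
for glucose in (5.0, 10.0):
    gk = fractional_velocity(glucose, GLUCOKINASE_KM_MM)
    hk = fractional_velocity(glucose, HEXOKINASE_KM_MM)
    print(f"{glucose:4.1f} mM glucose: glucokinase {gk:.2f}, hexokinase {hk:.2f} of Vmax")
```

Under these assumptions the low-Km enzyme is essentially saturated at both concentrations, whereas the glucokinase rate still rises with glucose, which is the property the text links to glucose entry into the hepatocyte.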

The strategy we employed to obtain a cDNAGK involved first purifying glucokinase, establishing its authenticity by its known kinetic parameters, and sequencing several peptides. Knowledge of the partial amino acid sequence allowed us to construct oligonucleotide probes. Because of the possibility that any given peptide could be similar to a region in one of the other hexokinases, these synthetic probes were used to screen cDNA libraries only after they had been used to demonstrate the expected regulation of mRNAGK by insulin, diabetes, fasting, or refeeding. This approach led to the isolation of a series of authentic cDNAGK molecules from which we were able to deduce the amino acid sequence of glucokinase. This information allowed us to compare the overall structure of glucokinase to other members of the hexokinase family and to draw certain inferences about the nature of the glucose- and ATP-binding sites in the molecule.

¹ The abbreviations used are: mRNAGK, messenger RNA coding for glucokinase; cDNAGK, DNA complementary to mRNAGK; bp, base pair(s); kb, kilobase pair(s); EGTA, [ethylenebis(oxyethylenenitrilo)]tetraacetic acid; DTT, dithiothreitol; HPLC, high pressure liquid chromatography; BSA, bovine serum albumin; NP-40, Nonidet P-40.

EXPERIMENTAL PROCEDURES²

RESULTS

Purification and Characterization of Glucokinase-Rat liver glucokinase was purified to homogeneity (Fig. 1) by the procedure outlined in the Miniprint. Our modifications to the protocols previously employed included a 5% polyethylene glycol precipitation step to remove contaminating protein before the column steps; batchwise binding and elution from DEAE-cellulose; the use of blue Sepharose in two different buffers, first to remove hydrophobic proteins and then to remove nucleotide-binding proteins; and one to two passes over an analytical high pressure liquid chromatography resin. These changes resulted in a procedure that yielded sufficient glucokinase to allow for the first determination of the partial amino acid sequence of this protein. The 21,000-fold purification required suggests that glucokinase accounts for approximately 0.005% of rat liver protein (Table I). The purified protein had a molecular mass of ~50 kDa, showed selective phosphorylation of glucose with a Km of 5.0 mM, and had a specific activity of 180 units/mg. These parameters matched those found by other investigators (Tables II and III) (1, 2, 5, 12, 13, 21-27). The combination of a high Km for glucose and the lack of phosphorylation of N-acetylglucosamine excludes the possibility that the 50-kDa protein purified was N-acetylglucosamine kinase, an enzyme that can bind to the N-(6-aminohexanoyl)-2-amino-2-D-glucopyranose-Sepharose affinity resin and copurify with glucokinase (28).
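The step from a 21,000-fold purification to an abundance of about 0.005% can be written out explicitly. The sketch below assumes that the final preparation is pure and that fold purification equals the ratio of final to starting specific activity; the starting specific activity it prints is therefore an inference from the two figures quoted in the text, not a value read from Table I.

```python
def abundance_from_fold_purification(fold: float) -> float:
    """Fraction of starting protein that is the enzyme, assuming the final product is pure."""
    if fold <= 0:
        raise ValueError("fold purification must be positive")
    return 1.0 / fold


def starting_specific_activity(final_units_per_mg: float, fold: float) -> float:
    """Specific activity of the starting material implied by the fold purification."""
    return final_units_per_mg / fold


fraction = abundance_from_fold_purification(21_000)
print(f"{fraction * 100:.4f}% of liver protein")  # prints 0.0048% of liver protein
print(f"{starting_specific_activity(180, 21_000):.4f} units/mg in the starting material")
```

The result, 0.0048%, rounds to the "approximately 0.005%" given in the text.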

Design of Oligonucleotide Probes and cDNA Cloning of Glucokinase-Purified glucokinase was digested with cyanogen bromide, protease V8, or endoprotease Lys C to generate peptides for amino acid sequence analysis. Forty-five percent of the amino acid sequence of glucokinase was obtained by analyzing 33 peptides. Several different oligonucleotide probes, ranging in size from 23 to 65 nucleotides, were designed from the peptide sequences by utilizing a combination of preferred codon usage and mixed oligonucleotide methods. Inosine was included in some of the probes at nucleotide positions where degeneracy of the code did not clearly favor a particular nucleotide (29). These probes were used to analyze hepatic poly(A)+ RNA isolated from diabetic rats or diabetic rats treated with insulin. Four of these oligonucleotides, CNB-3B, CNB-2B, CNB-1B, and V8-5B-5 (Table IV), hybridized to an ~2.4-kb mRNA in RNA isolated from the liver of diabetic rats treated with insulin (Fig. 2). None of the probes hybridized to mRNA isolated from the untreated diabetic rats. The 2.4-kb size of the mRNA is too small to encode the 100-kDa hexokinases, and mRNAs coding for these enzymes would not be expected to respond to insulin treatment (2). These results suggested that the oligonucleotides were specifically hybridizing to mRNAGK.

² Portions of this paper (including “Experimental Procedures,” Figs. 1 and 2, and Tables I-IV) are presented in miniprint at the end of this paper. Miniprint is easily read with the aid of a standard magnifying glass. Full size photocopies are included in the microfilm edition of the Journal that is available from Waverly Press.
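The mixed-oligonucleotide approach described above can be sketched in a few lines. The code back-translates a peptide with the standard genetic code, counts how many distinct coding sequences it has, and writes a position-wise mixed sequence with inosine (I) wherever all four bases occur. The peptide MQKEMD is simply a stretch of the deduced sequence chosen for illustration, not one of the probes in Table IV; the rule of placing inosine only at fully four-fold positions is an assumption; and the output is the coding strand, whereas a hybridization probe for mRNA would be its reverse complement.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def codons_for(residue: str) -> list[str]:
    """All codons that encode a one-letter amino acid."""
    codons = [codon for codon, aa in CODON_TABLE.items() if aa == residue]
    if not codons:
        raise ValueError(f"unknown residue: {residue!r}")
    return codons


def degeneracy(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode the peptide."""
    return prod(len(codons_for(residue)) for residue in peptide)


def mixed_probe(peptide: str) -> str:
    """Position-wise mixed coding sequence; 'I' marks positions where any base occurs."""
    parts = []
    for residue in peptide:
        codons = codons_for(residue)
        for position in range(3):
            bases = sorted({codon[position] for codon in codons})
            if len(bases) == 1:
                parts.append(bases[0])
            elif len(bases) == 4:
                parts.append("I")  # assumption: inosine at fully degenerate positions
            else:
                parts.append("(" + "/".join(bases) + ")")
    return "".join(parts)


peptide = "MQKEMD"  # illustrative stretch, not a probe from Table IV
print(degeneracy(peptide))   # prints 16
print(mixed_probe(peptide))  # prints ATGCA(A/G)AA(A/G)GA(A/G)ATGGA(C/T)
```

For residues with six codons (Leu, Ser, Arg), mixing bases position by position covers more sequences than the codons strictly require, which is one reason preferred codon usage is a useful complement to the mixed approach.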

Mixed oligonucleotide CNB-3B gave the strongest hybridization signal with mRNAGK, and it was also used to show that the mRNA was appropriately regulated by insulin. Therefore, CNB-3B was chosen to probe a cDNA library made with RNA obtained from normal rat liver. One million plaques were screened, and three clones were obtained. These clones were rescreened with two other mixed oligonucleotide probes, CNB-1B and CNB-2B, during plaque purification. One of the clones, λGK1, hybridized to all three oligonucleotide probes, so the nucleic acid sequence of its cDNA insert (GK1) was determined. It is 1601 bp long, and it encodes an open reading frame of 263 amino acids, which contains the sequences of five peptides found in purified glucokinase (Figs. 3 and 4). The cDNA isolated is therefore complementary to mRNAGK.
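The test applied above, whether a cDNA's open reading frame contains sequenced peptides, amounts to translating the cDNA and searching the translation. A minimal sketch follows. The 60-nucleotide fragment is transcribed from the first 20 codons shown in Fig. 4 (one base that is illegible in this copy is taken as G because the residue is Gly, which is an assumption), and the query peptide TRCGAQ is an arbitrary stretch used for illustration rather than one of the named peptides.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cdna: str, frame: int = 0) -> str:
    """Translate one forward reading frame; '*' marks a stop codon."""
    seq = cdna.upper()
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(frame, len(seq) - 2, 3))


def locate_peptide(cdna: str, peptide: str):
    """Return (frame, residue index) of the first match in a forward frame, else None."""
    for frame in range(3):
        index = translate(cdna, frame).find(peptide)
        if index != -1:
            return frame, index
    return None


# First 20 codons of the open reading frame as read from Fig. 4 (see lead-in).
FRAGMENT = "".join(
    "ATG GCT ATG GAT ACT ACA AGG TGT GGA GCC "
    "CAG TTG TTG ACT CTG GTC GAG CAG ATC CTG".split()
)

print(translate(FRAGMENT))                 # prints MAMDTTRCGAQLLTLVEQIL
print(locate_peptide(FRAGMENT, "TRCGAQ"))  # prints (0, 5)
```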

Regulation of mRNAGK by Insulin-Four hours after the injection of insulin into diabetic rats, mRNAGK was increased about 30-fold, as quantitated by hybridization using a GK1 probe. The induction by insulin was specific for mRNAGK, as no change was seen when a similar blot was probed with a calmodulin cDNA (Fig. 5) (30). Furthermore, another blot of the same RNA sample probed with a phosphoenolpyruvate carboxykinase cDNA showed the inhibition expected as the result of insulin treatment (Fig. 5) (19).

Isolation of Complete Coding Sequence for Glucokinase-Comparison of the length of GK1 with that of mRNAGK (1601 versus 2400 bases; Figs. 2 and 5) and comparison of the length of the open reading frame in GK1 with the approximate number of amino acids required to code for a protein of 50 kDa (263 versus ~475) indicated that GK1 did not include the entire coding sequence of glucokinase. Furthermore, GK1 did not contain either a polyadenylation signal or a poly(A) tract, indications that the 3'-region of the cDNA was incomplete. In order to obtain a longer clone, a new cDNA library was constructed using poly(A)+ RNA isolated from diabetic rats treated with insulin for 4 h. Genomic DNA probes,

[Figure 3: schematic of the overlapping cDNAs GK.Z2 and GK1, drawn 5' to 3' against a scale of 0, 500, 1000, 1500, 2000, and 2500 base pairs.]

FIG. 3. DNA sequencing strategy for rat liver glucokinase cDNA. The two overlapping glucokinase cDNAs, GK1 and GK.Z2, are shown schematically. The nucleotide sequence was derived from the ends of the cDNAs with mp18 or mp19 primers (h), from Sau3A-EcoRI or SphI-EcoRI restriction fragments with mp18 or mp19 primers (+), or from synthetic oligonucleotide primers


Rat Liver Glucokinase cDNA


CCTTGGCTGCCTATCTTTTGCAACACTCAGCCAGACAGTCCTCACCTGCAACAGGTGGCCTCAGGAGTCAGGAACATCTCTACTTCCCCAACGACCCCTGGGTTGTCCTCTCAGAG 116 Met Ala Met A s p T h r Thr Arg Cys G l y Ala G l n Leu Leu T h r Leu Val Glu Gln Ile Leu Ala Glu Phe G l n Leu Gln Glu G l u Asp Leu 30 ATG G C T ATG G A T A C T A C A AGG TGT G O A G C C CAG TTG TTG ACT CTG G T C GAG CAG ATC CTG G C A GAG T T C CAG CTG CAG GAG G A A G A C CTG 206

V0-3

Lys Lys Val M e t S e r Arq Met Gln Lys G l u Met Asp Arq Gly Leu Arq Leu Glu Thr H I S G l u Glu Ala S e r Val Lys Met Leu PC0 Thr 6 0 AAG AAG G T G ATG A G C CGG ATG CAG AAG GAG ATG G A C C G T G G C CTG AGG CTG GAG A C C CAC GAG G A G GCC ACT G T A AAG ATG T T A CCC ACC 296

VO-4

VO-19

T y r Val Arq S e r T h r P r o G l u Gly S e r G l u V a l Gly Asp Phe Leu Ser Leu Asp Leu Gly Gly Thr Asn Phe Acg Val Met Leu Val Lys 9 0

T A C GTG C G T T C C A C C C C A G A A G G C T C A G A A G T C G G A G A C T T T C T C T C C T T A G A C CTG G G A G G A ACC AAC T T C AGA GTG ATG CTG G T C AAA 386 86-21

VO-10

V a l G l y G l u G l y G l u A l a G l y G l n T r p S e r Val Lys Thr Lys H i s Gln Met Tyr Ser Ile P r o Glu Asp Ala Met Thr Gly Thr Ala Glu 120 G T G G G A G A G G G G G A G G C A G G G C A G T G G A G C G T G A A G A C A A A A C A C C A G A T G T A C T C C A T C C C C G A G G A C G C C A T G A C G G G C A C T G C C G A G 476 M e t L e u P h e A s p T y r I l e s e r G l u c y s I l e S e r A s p P h e L e u A s p L y s V0-6 CBB-4 H i s Gln Met Lys His Lys Lys Leu Pro Leu Gly Phe Thr Phe 150 A T G C T C T T T G A C T A C A T C T C T G A A T G C A T C T C T G A C T T C C T T G A C A A G C A T C A G A T G A A G C A C A A G AAA C T G C C C C T G G G C T T C A C C T T C 566 S e r Phe P r o V a l Arg H i s G l u Asp Leu A s p Lys Gly Ile Leu Leu A5n T r p Thr Lys G l y Phe Lys Ala Ser Gly Ala G l u Gly Asn Asn 180 T C C T T C C C T G T G A G G C A C G A A GAC CTA G A C AAG G G C A T C CTC C T C A A T TGG ACC AAG G G C T T C AAG G C C TCT G G A G C A G A A GGG A A C AAC 656

vu-5

VO-15

A T C G T A G G A C T T C T C C G A G A T GCT A T C AAG AGG A G A G G G G A C T T T GAG ATG G A T GTG GTG G C A ATG GTG AAC G A C A C A GTG G C C A C A ATG 146 I l e V a l G l y L e u L e u A r q A s p Ala Ile Lys Arq A r g G l y A s p Phe G l u Met Asp Val Val Ala Met Val Asn Asp Thr Val Ala Thr Met 210

t p G K 1 ends

Ile S e r Cys T y r T y r G l u A s p A r g G l n Cys G l u Val Gly Met Ile Val Gly Thr Gly Cys Asn Ala Cys Tyr Met G l u G l u Met Gln Asn 240

A T C T C C T G C T A C T A T G A A G A C CGC C A A T G T GAG G T C G G C ATG A T T GTG G G C A C T G G C T G C A A T G C C T G C T A C ATG GAG G A A ATG CAG AAT 836 V8-8

Val G l u Leu Val G l u G l y A s p G l u Gly A r g Met cys Val A s n T h c G l u T r p Gly Ala Phe Gly Asp Ser Gly G l u Leu Asp G l u Phe Leu 270 G T G G A G CTG G T G G A A G G G G A T GAG G G A C G C ATG T G C G T C A A C ACG GAG TGG G G C G C C T T C GGG G A C TCG G G C GAG CTG GAT GAG T T C CTA 926

VO-11

Leu G l u Tyr A s p Arg Met Val Asp Glu Ser Ser Ala Asn Pro Gly Gln G l n Leu Tyr Glu Lys I l e Ile Gly Gly Lys Tyr Met Gly Glu 300 CTG G A G T A T G A C CGG ATG GTG GAT G A A A G C T C A GCG A A C C C C G G T CAG CAG CTG T A C GAG AAG ATC ATC G G T GGG AAG T A T ATG GGC GAG 1016

CBB-3

Leu V a l Arq L e u Val Leu Leu Lys L e u Val A s p G l u A s n Leu Leu Phe H i s Gly Glu Ala Ser G l u Gln Leu Arg Thr Arq Gly Ala Phe 330 CTG G T A C G A C T T G T G CTG CTT AAG CTG G T G G A C GAG A A C CTT CTG T T C CAC G G A GAG G C C T C G GAG CAG CTG CGC ACG CGT G G T G C T T T T 1106 G l u T h r Acq Phe V a l S e r G l n Val G l u S e r A s p S e r Gly Asp Arq Lys G l n Ile His Asn Ile Leu Ser Thr Leu Gly Leu Arg P r o Ser 360 G A G A C C C G T T T C G T G T C A C A A GTG GAG A G C G A C TCC GGG G A C C G A AAG CAG ATC CAC A A C A T C C T A AGC A C T CTG GGG CTT CGA CCC TCT 1196

c m - 1

V a l T h r A s p C y s A s p I l e Val Arg Arq Ala Cys Glu S e r Val S e r T h r Arg Ala Ala H I S Met Cys Ser Ala Gly Leu A;a Gly Val Ile 390

G T C A C C G A C T G C G A C A T T GTG CGC C G T G C C T G T G A A A G C GTG T C C A C T C G C G C C G C C CAT ATG T G C T C C G C A GGA CTA GCT GGG GTC ATA 1286 Asn Arq Met Arg G l u S e r A r g S e r G l u Asp Val Met Arq Ile Thr Val Gly Val Asp Gly Ser Val Tyr Lys Leu H 1 s Pro S e r Phe Lys 420

A A T C G C ATG C G C G A A A G C CGC ACT GAG G A C G T G ATG CGC ATC A C T GTG G G C GTG GAT G G C T C C GTG T A C AAG CTG C A C CCG AGC TTC AAG 1376 CUB-2

G l u Arq P h e His Ala S e r Val Arq A r g Leu Thr Pro Asn Cys G 1 U Ile Thr Phe Ile GlU S e r Glu Glu Gly S e r Gly Arg Gly Ala Ala 450 G A G CGG T T T C A C G C C A C T GTG CGC AGG CTG A C A CCC A A C T G C G A A A T C ACC T T C ATC G A A T C A GAG GAG G G C AGC G G C AGG G G A GCC G C A 1466 Leu Val Ser Ala Val Ala Cy5 Lys Lys Ala Cy5 Met Leu Ala Gln ---

C T G G T C T C T G C G G T G G C C T G C A A G A A G G C T T G C A T G C T G G C C C A G T G A A A T C C A G G T C A T A T G G A C C G G G A C C T G G G T T C C A C G G G G A C T C C A C A C A C C A C A A 1569 4 6 5 ATGCTCCCAGCCCACCGGGCCAGGAGACCTATTCTGCTGCTACCCCTGGAAAATGGGGAGAGGCCCCTGCAAGCCGAGTCGGCCAGTGGGACAGCCCTAGGGCTCTCAGCCTGGGGCAG 1688 GGGGCTGGGAGGAAGAAGAGGATCAGAGGCGCCAAGGCCTTTCTTGCTAGAATCAACTACAGAAAATGGCGGAAAATACTCAGGACTTGCACTTTCACGATTCTTGCTTCCCAAGCGTG 1801 G G T C T G G C C T C C C A A G G G A A T G C T T C C T G G A C C T T G C ~ T G G C C T G G C T T C C C T G G G G G G G A C A C A C C T T C A T G G G G A G G T A A C T T C A G C A G T T C G G C C A G A C C A G A C C C C A G G A G A G T 1926 AAGGGCTGCTGGTCACCCAGACCTGGCTGTTTTCTTGTCTGTGGCTGAAGAGGCCGGGGAGCCATGAGAGACTGACTATCCGGCTACATGGAGAGGACTTTCCAGGCATGAACATGCCA 2045 GAGACTGTTGCCTTCATATACCTCCACCCGAGTGGCTTACAGTTCTGGGATGAACCCTCCCAGGAGATGCCAGAGGTTAGAGCCCCAGAGTCCTTGCTCTAAGGGGACCAGAAAGGGGA 2164 GGCCTCACTCTGCACTATTCAAGCAGGAATCATCTCCAACACTCAGGTCCCTGACCCAGGAGGAAGAAGCCACCCTCAGTGTCCCTCCAAGAGACCACCCAGGTCCTTCTCTCCCTCGT 2283 T C C C A A A T G C C A G C C T C T C T A C C T G G G A C T G T G G G G G A G T T T T T A A ~ T A T T T A A A A C T T I A , ~ 2346

FIG. 4. DNA and amino acid sequence of rat liver glucokinase. Bases are numbered from the 5'-end of the cDNA sequence in GK.Z2. The in-phase termination codon is designated by dashed lines, and the putative polyadenylation signal is underlined. Amino acids are numbered starting with the most 5'-ATG. Amino acids identified by peptide sequencing are indicated by overlines and the peptide name is shown above each line.

available from the characterization of the glucokinase gene,³ were used to simplify library screening. The DNA fragments selected for probes contained exons that encoded glucokinase peptides V8-3 and CNB-4 (Fig. 4); therefore, any clones obtained would contain inserts longer than GK1.

Three clones were identified when 5 x 10‘ plaques from a library enriched for glucokinase cDNA were screened. The cDNA inserts from all three clones were about 2400 bp in length (data not shown). One, GK.Z2, was slightly longer than the other two, so its DNA sequence was determined. GK.Z2 contained additional sequences at both the 5'- and 3'-ends as compared to GK1. However, it also contained extraneous DNA at the 5'-end, which is due to a cloning artifact (data not shown). The sequence from GK.Z2 that corresponds to glucokinase cDNA is shown in Fig. 4. One of the other cDNA inserts isolated, GK.Z1, contained all of the sequence in GK1 and all but 14 bp of the 5'-noncoding glucokinase cDNA sequence found in GK.Z2 (data not shown). GK.Z2 contains 2346 bp plus a 54-base poly(A) tract (Figs. 3 and 4). The sequence ATTAAA, located 18 nucleotides upstream from the poly(A) tract, is the probable polyadenylation signal (31).

Several criteria were used to select the translation initiation codon. First, the site selected is in good agreement with the initiation site consensus sequence described by Kozak (32). Second, the calculated protein mass of 51,924 daltons compares with the observed ~50-kDa mass of glucokinase (Fig. 1). Third, all of the peptides obtained from purified glucokinase (about 45% of the protein) are contained in the open reading frame of GK.Z2. Fourth, the size of the glucokinase cDNA (2346 bp) and a poly(A) tail of average length (about 100 nucleotides) predict an mRNA length of about 2.4 kb. This size corresponds closely with that observed on blot transfers of RNA (Figs. 2 and 5), indicating that little or no cDNA sequence remains unidentified. However, since the amino acid sequence from the amino-terminal peptide was not obtained, presumably because it is blocked (data not shown), the exact assignment of the initiating codon remains tentative.

³ M. Magnuson et al., manuscript in preparation.

[Figure 5: RNA blot with three panels (PEPCK, glucokinase, calmodulin), each with lanes D and I; RNA standards at 9.5, 7.5, 4.4, 2.4, and 1.4 kb.]

FIG. 5. RNA blot transfer analysis. Poly(A)+ RNA was isolated from diabetic rats (lanes D) or diabetic rats treated with insulin for 4 h (lanes I). RNA was size-fractionated in the agarose/formaldehyde gel system described for Fig. 2, blot-transferred to nitrocellulose, and then hybridized with randomly primed cDNA probes. Left, 2 µg of poly(A)+ RNA was probed with phosphoenolpyruvate carboxykinase (PEPCK) cDNA PC116 (7); middle, 20 µg of poly(A)+ RNA was probed with GK1; right, 20 µg of poly(A)+ RNA was probed with calmodulin cDNA pCAM-9H8.
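Two of the checks described above lend themselves to a short sketch: locating a polyadenylation hexamer near the 3'-end of a cDNA, and predicting mRNA length from cDNA length plus a poly(A) tail. The toy sequence below is hypothetical and was built only so that ATTAAA sits 18 nucleotides from the end; it is not the GK.Z2 sequence. The 40-nucleotide search window and the choice to measure distance from the end of the hexamer are likewise assumptions.

```python
POLY_A_SIGNALS = ("AATAAA", "ATTAAA")


def find_poly_a_signal(cdna: str, window: int = 40):
    """Return (hexamer, nucleotides between hexamer and 3'-end) for the 3'-most signal, or None."""
    seq = cdna.upper()
    tail_start = max(0, len(seq) - window)
    best = None
    for hexamer in POLY_A_SIGNALS:
        position = seq.rfind(hexamer, tail_start)
        if position != -1 and (best is None or position > best[1]):
            best = (hexamer, position)
    if best is None:
        return None
    hexamer, position = best
    return hexamer, len(seq) - (position + len(hexamer))


def predicted_mrna_kb(cdna_bp: int, poly_a_nt: int) -> float:
    """Predicted mRNA length in kb from cDNA length (without tail) and poly(A) length."""
    return (cdna_bp + poly_a_nt) / 1000


# Hypothetical 3'-end: 10 nt, the signal, then an 18-nt spacer before the poly(A) tract.
toy_three_prime_end = "GGGACTGTGG" + "ATTAAA" + "GCTAGCTAGCTAGCTAGC"

print(find_poly_a_signal(toy_three_prime_end))  # prints ('ATTAAA', 18)
print(predicted_mrna_kb(2346, 100))             # cDNA length and average tail from the text
```

With the 2346-bp cDNA and a tail of about 100 nucleotides the prediction is about 2.45 kb, consistent with the ~2.4-kb species seen on the blots.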

Comparison of Glucokinase with Yeast Hexokinase and Carboxyl-terminal Region of Rat Brain Hexokinase I-We compared the predicted protein sequence for glucokinase with that of yeast hexokinase (33, 34) and the carboxyl-terminal region of rat brain hexokinase I (35). Glucokinase shared 53% identity with rat brain hexokinase I and 33% amino acid sequence identity with yeast hexokinase PI as determined by the DNAStar program AALIGN (36), which utilizes the PAM250 matrix described by Lipman and Pearson (37). This program compares closely related proteins and determines the probability that a given amino acid would be replaced by any other amino acid over a fixed period of time. Empirical studies have shown that this matrix allows for the detection of more distant relationships than when only identities or minimum base changes are considered. In addition to sequence identity, there are 116 residues in the rat brain carboxyl-terminal portion of hexokinase I that represent conservative amino acid replacements when compared with glucokinase and 139 residues in glucokinase that represent conservative amino acid replacements when compared with yeast hexokinase PI. Considering these changes, glucokinase is identical or similar to the carboxyl-terminal portions of brain hexokinase I and yeast hexokinase PI at 75 and 63% of positions, respectively.
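The two figures quoted for each comparison, percent identity and percent identical-or-similar, can be computed from an alignment as sketched below. This is not the AALIGN procedure: the conservative-replacement groups are a simple illustrative grouping rather than the PAM250 matrix, gapped columns are excluded from the denominator by assumption, and the two aligned strings are hypothetical toy sequences.

```python
# Assumption: a simple chemical grouping, used here in place of PAM250 scores.
CONSERVATIVE_GROUPS = ("ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG")


def alignment_scores(seq_a: str, seq_b: str) -> tuple[float, float]:
    """Return (% identity, % identical or conservatively replaced) over ungapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * identical / compared, 100 * (identical + similar) / compared


# Hypothetical aligned fragments ('-' is a gap); not taken from Fig. 6.
identity, similarity = alignment_scores("MLDDRARMEA", "MLEDKA-LQS")
print(f"identity {identity:.1f}%, identical or similar {similarity:.1f}%")
```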

DISCUSSION

Glucokinase mRNA-The protocol employed permitted us to obtain amounts of highly purified glucokinase sufficient for the analysis of a significant portion (~45%) of the amino acid sequence. This sequence information led to the construction of oligonucleotide probes that resulted in the isolation of an authentic cDNAGK. The availability of the nearly full-length cDNA, GK.Z2, allowed us to derive the structure of mRNAGK. The portion of mRNAGK we have cloned is composed of 2,346 nucleotides, exclusive of the poly(A) tail. It contains an open reading frame of 1,395 nucleotides flanked on the 3'-end by 835 nucleotides and on the 5'-end by 116 nucleotides (GK.Z2). The mRNAGK codes for a protein of 465 amino acids with a calculated molecular mass of 51,924 Da. The methionine chosen as the putative initiation site fits more closely with a Kozak (32) consensus than does a second methionine located 2 amino acids away.

[Figure 6: three-way sequence alignment of rat brain hexokinase I (HEXI), glucokinase (GK), and yeast hexokinase PI (YA), with the ATP-binding and glucose-binding regions marked.]

FIG. 6. Comparison of glucokinase and other hexokinases. The amino acid sequences of rat brain hexokinase I (HEXI), glucokinase (GK), and yeast hexokinase PI (YA) were aligned by the DNAStar program AALIGN. Vertical lines indicate sequence identity, and the vertical dots indicate conservative amino acid replacements (36, 37). The slashed-line border designates the putative ATP-binding domains in these three proteins. The multidot borders designate the putative glucose-binding domains.
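The figures given for the cloned mRNA are mutually consistent, as the following arithmetic check shows. All inputs are the numbers stated in the text; the mean residue mass is a derived quantity included only as a plausibility check.

```python
def summarize_cdna(five_prime_nt: int, orf_nt: int, three_prime_nt: int,
                   residues: int, mass_da: float) -> dict:
    """Derive total length, codon count, and mean residue mass from the stated figures."""
    return {
        "total_nt": five_prime_nt + orf_nt + three_prime_nt,
        "orf_is_whole_codons": orf_nt % 3 == 0,
        "codons": orf_nt // 3,
        "mean_residue_mass_da": mass_da / residues,
    }


summary = summarize_cdna(five_prime_nt=116, orf_nt=1395, three_prime_nt=835,
                         residues=465, mass_da=51_924)
print(summary["total_nt"])                        # prints 2346
print(summary["codons"])                          # prints 465
print(round(summary["mean_residue_mass_da"], 1))  # prints 111.7
```

The three segments sum to the 2,346 nucleotides reported, and 1,395 nucleotides correspond to exactly 465 codons, which implies that the termination codon is counted within the 835-nucleotide 3'-flank.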

The 30-fold induction of mRNAGK resulting from insulin treatment of diabetic rats substantiates the observations of Spence (7) and Sibrowski and Seitz (6), who quantitated mRNAGK activity using in vitro translation of mRNA with immunoprecipitation of the labeled glucokinase by polyclonal antibodies. Our data also are in agreement with those of Iynedjian et al. (9), who showed that insulin treatment of diabetic rats stimulated the expression of a 2.4-kb mRNA. Iynedjian et al. used a 1700-bp probe isolated from a λgt11 plasmid by an antibody directed against glucokinase (8). It appears that each of our cDNAs hybridizes to an mRNA of the same size and that this mRNA is regulated in a similar fashion.

Amino Acid Homology-The availability of sequence information for glucokinase made it possible to test directly the hypothesis that the hexokinases share sequence homology. Mammalian glucokinase and yeast hexokinases PI and PII are each composed of a single polypeptide chain of approximately 50 kDa, and they are not subject to allosteric regulation by glucose 6-phosphate. In contrast, mammalian hexokinases I-III consist of a single polypeptide chain of molecular mass ~100 kDa and are allosterically inhibited by the reaction product, glucose 6-phosphate (21). The evolutionary relationship between these proteins was first hypothesized when antigenic cross-reactivity between yeast and mammalian hexokinases was determined using polyclonal antibodies (11). This concept was further strengthened when peptide maps of hexokinases I and III established the existence of two separate domains in the low Km enzymes. An amino-terminal domain contains the glucose 6-phosphate-binding or allosteric site (38), and a carboxyl-terminal domain contains both the glucose- and ATP-binding sites (38). As shown in Fig. 6, there is a high degree of sequence identity among yeast hexokinase, glucokinase, and the relevant regions of at least one mammalian hexokinase (brain, type I). The hypothesis that the 100-kDa hexokinases evolved as a result of a gene duplication and fusion event involving a lower molecular mass hexokinase, where one catalytic site was retained and the other evolved into a regulatory site, would predict this kind of similarity at the carboxyl-terminal end of the molecules. Sequence analysis of the amino-terminal portion of the high molecular mass mammalian hexokinases is the next critical step in testing this hypothesis.

GLUCOSE BINDING DOMAIN
SOURCE   ENZYME           LOCATION
Rat      Glucokinase      144-171
Rat      Hexokinase I     C-Domain
Rat      Hexokinase III   C-Domain
Yeast    Hexokinase PI    151-178
Yeast    Hexokinase PII   151-178

ATP BINDING DOMAIN
SOURCE   ENZYME                            LOCATION
Rat      Glucokinase                       78-102
Rat      Hexokinase I                      C-Domain
Yeast    Hexokinase PI                     86-111
Yeast    Hexokinase PII                    86-111
Bovine   Protein Kinase (cAMP dependent)   37-61

FIG. 7. Putative glucose- and ATP-binding domains. Top, the region depicted is thought to be important for glucose binding to the yeast hexokinases (39, 40). The amino acid sequences of glucokinase, rat brain hexokinase I, and yeast hexokinases PI and PII were aligned by the DNAStar program AALIGN (36). The amino acid sequence of peptides 5 and 6 from rat skeletal muscle hexokinase III was matched by inspection (41). Bottom, the core sequence of a putative ATP-binding domain of the protein kinase catalytic subunit from bovine skeletal muscle and other protein kinases (41-44) was matched by inspection with the amino acid sequences of the hexokinases. An essential lysine located 11 residues away from the core sequence in the protein kinase was aligned, by inspection, with a lysine in each of the hexokinases.

Glucose-binding Domain-It is apparent that several regions of glucokinase show a high degree of sequence conservation with other hexokinases (Fig. 6). We attempted to relate these regions to functional domains that have been defined in the yeast and brain hexokinases. Determination of the tertiary protein structure of the yeast hexokinase isoenzymes in the presence of glucose and glucose analogs identified specific amino acid residues that participate in the binding of glucose (39, 40). Ser158, Asp211, Glu269, and Glu302 all hydrogen-bond to the hydroxyl groups of glucose (40). All of these residues appear to be conserved and in similar locations in rat liver glucokinase, mammalian hexokinase III (41), and the carboxyl-terminal region of rat brain hexokinase I (35) (Fig. 6). Although the basic features of the glucose-binding domain are present in all of the hexokinases, other amino acid residues must be involved in glucokinase, which binds glucose with a much lower affinity than do the mammalian hexokinases (2).

ATP-binding Domain-Various techniques have been used to define the ATP-binding site in several proteins (40-44). A similar region is found in glucokinase (Fig. 6). The corresponding sequences of bovine heart protein kinase (residues 37-61), glucokinase (residues 78-102), rat brain hexokinase I (C-domain), and yeast hexokinase (86-111) are shown in detail in Fig. 7. This putative ATP-binding domain has several conserved features, the most invariant of which is a lysine located 11-14 residues, in the carboxyl-terminal direction, from the core sequence. There are no data that support the hypothesis that these residues participate in the binding of ATP to glucokinase, but the highly conserved sequence certainly suggests that this is possible.
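The spacing rule described above (a core sequence followed by a lysine 11-14 residues toward the carboxyl terminus) lends itself to a simple pattern scan. The sketch below is illustrative only. It assumes the core is the glycine-rich Gly-X-Gly-X-X-Gly consensus known from protein kinases, which is not stated in the text above, and it assumes the 11-14 residue distance is counted from the last residue of the core. The test sequence is invented, and the function name is hypothetical.

```python
import re

# Assumption: the core is Gly-X-Gly-X-X-Gly. The lookahead lets
# overlapping candidate cores be reported as well.
CORE = re.compile(r"(?=(G.G..G))")


def find_core_with_lysine(seq, min_gap=11, max_gap=14):
    """Return (core_start, lysine_positions) pairs, 1-based, for every
    core that has a lysine min_gap..max_gap residues C-terminal to the
    last residue of the core (assumed counting convention)."""
    hits = []
    for match in CORE.finditer(seq):
        start = match.start(1)   # 0-based index of the first core residue
        last = start + 5         # 0-based index of the last core residue
        lysines = [
            last + gap + 1       # convert to a 1-based position
            for gap in range(min_gap, max_gap + 1)
            if last + gap < len(seq) and seq[last + gap] == "K"
        ]
        if lysines:
            hits.append((start + 1, lysines))
    return hits


# Invented toy sequence: core at residues 4-9, lysine 12 residues later.
toy = "AAA" + "GSGAFG" + "A" * 11 + "K" + "AAA"
print(find_core_with_lysine(toy))
```

A hit from such a scan only flags a candidate region. As the text notes, whether these residues actually bind ATP in glucokinase has to be established experimentally.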

Concluding Remarks-The isolation of an authentic cDNA for hepatic glucokinase makes further studies of the structure and function of this enzyme feasible. Furthermore, since mRNAGK is regulated by insulin and glucagon in the liver, a cDNAGK enables us to isolate and characterize the DNA sequences in the gene that might be responsible for this regulation. Further studies in this direction are now underway and may contribute to our understanding of the function and regulation of this important enzyme.

Acknowledgments-We thank Mark Lively (Oncology Research Center, Bowman Gray Medical School) and Tom Lucas and Paul Matrisian (Howard Hughes Medical Institute, Vanderbilt University), who helped perform the amino acid sequence analysis. Warren Zimmer provided the chicken calmodulin cDNA (pCAM-9H8). Patrick Quinn provided a critical analysis of the manuscript. The expert technical assistance of Steve Koch and Elisabeth Zimmerman is gratefully acknowledged, as is the help Deborah Caplenor provided in preparing the manuscript.

REFERENCES

1. Ureta, T. (1982) Comp. Biochem. Physiol. B Comp. Biochem. 71, 549-555
2. Weinhouse, S. (1976) Curr. Top. Cell. Regul. 11, 1-50
3. Meglasson, M. D., and Matschinsky, F. M. (1986) Diabetes Metab. Rev. 2, 163-214
4. Spence, J. T., and Pitot, H. C. (1979) J. Biol. Chem. 254, 12331-12336
5. Sharma, C., Manjeshwar, R., and Weinhouse, S. (1963) J. Biol. Chem. 238, 3840-3845
6. Sibrowski, W., and Seitz, H. J. (1984) J. Biol. Chem. 259, 343-346
7. Spence, J. T. (1983) J. Biol. Chem. 258, 9143-9146
8. Iynedjian, P. B., Ucla, C., and Mach, B. (1987) J. Biol. Chem. 262, 6032-6038
9. Iynedjian, P. B., Gjinovci, A., and Renold, A. E. (1988) J. Biol. Chem. 263, 740-744
10. Trayer, I. P., and Darby, M. K. (1981) Biochem. Soc. Trans. 9, 62
11. Lawrence, G. M., and Trayer, I. P. (1984) Comp. Biochem. Physiol. B Comp. Biochem. 79, 233-238
12. Storer, A. C., and Cornish-Bowden, A. (1976) Biochem. J. 159, 7-14
13. Allen, M. B., and Walker, D. G. (1980) Biochem. J. 185, 565-582
14. Holroyde, M. J., Allen, M. B., Storer, A. C., Warsy, A. S., Chesher, J. M. E., Trayer, I. P., Cornish-Bowden, A., and Walker, D. G. (1976) Biochem. J. 153, 363-373
15. Seelig, G. F., and Colman, R. F. (1977) J. Biol. Chem. 252, 3671-3678
16. Wilson, J. E. (1976) Biochem. Biophys. Res. Commun. 72, 816-823
17. Holroyde, M. J., Chesher, J. M. E., Trayer, I. P., and Walker, D. G. (1976) Biochem. J. 153, 351-361
18. Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
19. Beale, E., Andreone, T., Koch, S., Granner, M., and Granner, D. (1984) Diabetes 33, 328-332
20. Davis, L. G., Dibner, M. D., and Battey, J. R. (1986) Basic Methods in Molecular Biology, Elsevier Scientific Publishing Co., Inc., New York
21. Grossbard, L., and Schimke, R. T. (1966) J. Biol. Chem. 241, 3546-3560
22. Lawrence, G. M., Walker, D. G., and Trayer, I. P. (1983) Biochim. Biophys. Acta 743, 219-225
23. Allen, B., Brockelbank, J. L., and Walker, D. G. (1980) Biochim. Biophys. Acta 614, 357-366
24. Cardenas, M. L., Rabajille, E., and Niemeyer, H. (1984) Eur. J. Biochem. 145, 163-171
25. Parry, M. J., and Walker, D. G. (1966) Biochem. J. 99, 266-274
26. Salas, J., Salas, M., Vinuela, E., and Sols, A. (1965) J. Biol. Chem. 240, 1014-1015
27. Easterby, J. S., and O'Brien, M. H. (1973) Eur. J. Biochem. 38, 201-211
28. Meglasson, M. D., Trueheart Burch, P., Berner, D. K., Najafi, H., Vogin, A. P., and Matschinsky, F. M. (1983) Proc. Natl. Acad. Sci. U. S. A. 80, 85-89
29. Ohtsuka, E., Matsuki, S., Ikehara, M., Takahashi, Y., and Matsubara, K. (1985) J. Biol. Chem. 260, 2605-2608
30. Morley, J. E., Levine, A. S., Brown, D. M., and Handwerger, B. S. (1982) Biochem. Biophys. Res. Commun. 108, 1418-1423
31. Wickens, M., and Stephenson, P. (1984) Science 226, 1045-1051
32. Kozak, M. (1984) Nucleic Acids Res. 12, 857-872
33. Kopetzki, E., Entian, K. D., and Mecke, D. (1985) Gene (Amst.) 39, 95-102
34. Frohlich, K. U., Entian, K. D., and Mecke, D. (1985) Gene (Amst.) 36, 105-111
35. Schwab, D. A., and Wilson, J. E. (1988) J. Biol. Chem. 263, 3220-3224
36. DNAStar Version 1.65 (1988) Comprehensive Microcomputer Systems for Molecular Biology, DNAStar, Inc., Madison, WI
37. Lipman, D. J., and Pearson, W. R. (1985) Science 227, 1435-1441
38. Wilson, J. E., and Nemat-Gorgani, M. (1986) Arch. Biochem. Biophys. 251, 97-103
39. Steitz, T. A., and Bennett, W. S., Jr. (1980) J. Mol. Biol. 140, 211-230
40. Harrison, R. (1985) Crystallographic Refinement of Two Isozymes of Yeast Hexokinase and Relationship of Structure to Function. Ph.D. thesis, Yale University
41. Hanks, S. K., Quinn, A. M., and Hunter, T. (1988) Science 241, 42-52
42. Zoller, M. J., Nelson, N. C., and Taylor, S. S. (1981) J. Biol. Chem. 256, 10837-10842
43. Hashimoto, E., Takio, K., and Krebs, E. (1982) J. Biol. Chem. 257, 727-733
44. Kamps, M. P., Taylor, S. S., and Sefton, B. M. (1984) Nature 310, 589-592
45. Beale, E. G., Chrapkiewicz, N. B., Scoble, H. A., Metz, R. J., Quick, D. P., Noble, R. L., Donelson, J. E., Biemann, K., and Granner, D. K. (1985) J. Biol. Chem. 260, 10748-10760

Supplemental Material to: The Amino Acid Sequence of Rat Liver Glucokinase Deduced from Cloned cDNA

by

Teresa L. Andreone, Richard L. Printz, Simon J. Pilkis, Mark A. Magnuson, and Daryl K. Granner

[Table: purification of rat liver glucokinase. Steps listed: homogenization, PEG, DEAE-cellulose, Blue Sepharose (two passes), HPLC + affinity. The numerical columns are not recoverable.]


[Table: values for fructose, mannose, galactose, and one unreadable row across six columns. The column headings and row assignments are not recoverable.]

[Figure: Hepatic poly(A)+ RNA probed with GK oligomers. Size markers: RNA standards (kb), with 9.5 and 7.5 legible.]
